{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "\n", "import scipy.cluster.hierarchy as shc\n", "\n", "from sklearn.preprocessing import MinMaxScaler\n", "\n", "from sklearn.cluster import AgglomerativeClustering\n", "from sklearn.cluster import KMeans\n", "\n", "from sklearn.metrics import confusion_matrix\n", "from sklearn.metrics import silhouette_score\n", "\n", "from sklearn import datasets\n", "\n", "%matplotlib inline\n", "pd.set_option(\"display.max_columns\", None)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Lab 22 - Determining the number of clusters\n", "\n", "We will look at two methods for determining the number of clusters. \n", "\n", "## Inertia and the elbow method\n", "The first method assmes you have centers for the clusters, as in k-means clustering. It computes the sum of the squared distances of samples to their closest cluster center.\n", "\n", "We'll load the iris dataset, as in Lab 20." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2
24.73.21.30.2
34.63.11.50.2
45.03.61.40.2
\n", "
" ], "text/plain": [ " sepal length (cm) sepal width (cm) petal length (cm) petal width (cm)\n", "0 5.1 3.5 1.4 0.2\n", "1 4.9 3.0 1.4 0.2\n", "2 4.7 3.2 1.3 0.2\n", "3 4.6 3.1 1.5 0.2\n", "4 5.0 3.6 1.4 0.2" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iris_dict = datasets.load_iris()\n", "\n", "iris = pd.DataFrame(iris_dict.data, columns = iris_dict.feature_names)\n", "iris.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Scale the data columns to be between 0 and 1." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "scaler = MinMaxScaler(feature_range = (0,1))\n", "iris_scaled = scaler.fit_transform(iris)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use k-means with k = 3 to compute the clusters. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "kmeans = KMeans(n_clusters = 3)\n", "kmeans_cluster = kmeans.fit_predict(iris_scaled)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", " 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", " 0, 0, 0, 0, 0, 0, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,\n", " 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,\n", " 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1,\n", " 1, 1, 1, 2, 1, 1, 1, 1, 1, 2, 1, 2, 1, 2, 1, 1, 2, 2, 1, 1, 1, 1,\n", " 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 2], dtype=int32)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "kmeans_cluster" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can compute the sum of the squared distance of the samples to their closest cluster center as follows (`kmeans` should be the variable holding information about the k-means clustering 
algorithm)." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6.982216473785234" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "kmeans.inertia_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To find the best k value, we make a loop to compute the inertia for each k, storing the result in a list." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "inertia_list = []\n", "for k in range(1,11):\n", " kmeans = KMeans(n_clusters=k, random_state=0)\n", " kmeans_clusters = kmeans.fit_predict(iris_scaled)\n", " inertia_list.append(kmeans.inertia_)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot the values in inertia_list. You can use `range(1,11)` as the x values." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAXQAAAD8CAYAAABn919SAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAHQ1JREFUeJzt3Xl0XOd53/HvMwv2bUiCFEUSAy20ZMk2SQCWtbl1ZKvVcRxLSp3YbqKoPW4Vx5ZsJz6J7T966qSpq2xeEi81IylSbUW2a8uRrSppZVmOKy9SQZDURtnaCJAUF5DERuwYPP1jLkCAAoQBOIM7c+f3OWfOnblzB/fRHPF333nve+9r7o6IiJS+WNgFiIhIfijQRUQiQoEuIhIRCnQRkYhQoIuIRIQCXUQkIhToIiIRoUAXEYkIBbqISEQkVnNn69at89bW1tXcpYhIydu1a9dxd29eartVDfTW1lY6OztXc5ciIiXPzLpz2U5dLiIiEaFAFxGJCAW6iEhEKNBFRCJCgS4iEhEKdBGRiFCgi4hEREkE+qO/OMaXf/RC2GWIiBS1kgj0n75wnM//4HnGpzJhlyIiUrRKItDb0ykmpqZ5+tBg2KWIiBStkgj0tnQKgK7uvpArEREpXjkHupnFzWy3mT0YvD7PzB43sxfM7JtmVlGoItfXV9GypoZdCnQRkUUtp4X+UWDfnNd/BnzO3S8E+oAP5LOwM7WnU3R29+HuhdyNiEjJyinQzWwz8KvAHcFrA64Bvh1scg9wQyEKnNGWTnH81DgHTo4WcjciIiUr1xb654E/AqaD12uBfnefCl4fBDYt9EEzu8XMOs2ss7e3d8WFdgT96Lt6Tq74b4iIRNmSgW5m7wKOufuulezA3Xe6e4e7dzQ3L3l/9kW9bkM9dZUJ9aOLiCwilwkurgLebWbvBKqABuALQJOZJYJW+mbgUOHKhHjM2NHSROd+BbqIyEKWbKG7+6fcfbO7twLvA37o7r8FPAq8J9jsZuCBglUZaGtJ8YujQwyNTRZ6VyIiJedsxqF/AvgDM3uBbJ/6nfkpaXEdrSncYc+B/kLvSkSk5CxrTlF3/xHwo+D5S8Bl+S9pcdu3NGEGu7r7eOvWlffHi4hEUUlcKTqjvirJRRvqdWJURGQBJRXokL3AaHdPP5lpXWAkIjJXyQV6R2uKU+NT/PLoUNiliIgUlZIL9PaWNQDqdhEROUPJBfqWNdWsq6tUoIuInKHkAt3MaE83KdBFRM5QcoEO0JFeQ8/JEY4NjYVdiohI0SjJQD894YUuMBIRmVGSgf6GTQ1UxGPs6tadF0VEZpRkoFcm4rxxc6P60UVE5ijJQIfsBUZPHxpkbDITdikiIkWhpAN9IjPNM68MhF2KiEhRKNlAb2vJnhjV/dFFRLJKNtCb6ytJr61RP7qISKBkAx2gvSVFV08f7rpRl4hIaQd6a4rjpyboOTkSdikiIqHLZZLoKjN7wsz2mtkzZvbHwfq7zexlM9sTPLYXvtz52tPqRxcRmZFLC30cuMbdtwHbgevM7PLgvT909+3BY0/BqlzE1vX11Fcm2NWjQBcRWXIKOs92UJ8KXiaDR1F0WsdjxvaWJrp0YlREJLc+dDOLm9ke4BjwsLs/Hrz1X83sSTP7nJlVFqzK19CRXsMvjg4xODYZxu5FRIpGToHu7hl33w5sBi4zszcAnwIuBt4MrAE+sdBnzewWM+s0s87e3t48lX1aezqFO+zu0Y26RKS8LWuUi7v3A48C17n7Yc8aB/4OuGyRz+x09w5372hubj77is+wbUsjMdMMRiIiuYxyaTazpuB5NXAt8JyZbQzWGXAD8HQhC11MfVWSi85pUD+6iJS9JU+KAhuBe8wsTvYA8C13f9DMfmhmzYABe4APFrDO19SRTnF/10Ey0048ZmGVISISqlxGuTwJ7Fhg/TUFqWgF2tMpvvbzbp47Msil5zaGXY6ISChK+krRGe2zMxi
p20VEylckAn1zqprm+kqdGBWRshaJQDczOtIpXTEqImUtEoEO2W6XAydHOTY4FnYpIiKhiEygtwX96Op2EZFyFZlAv/TcBioSMQW6iJStyAR6ZSLOts2N6kcXkbIVmUCHbLfL04cGGJvMhF2KiMiqi1Sgt7ekmMw4Tx0aCLsUEZFVF6lA14lRESlnkQr0dXWVnLeuVoEuImUpUoEO0NaSoqu7j+xESyIi5SNygd6eTnFieIL9J0bCLkVEZFVFMtBB/egiUn4iF+hb19dRX5VQoItI2YlcoMdiNtuPLiJSTiIX6JDtdvnlsSEGRifDLkVEZNXkMqdolZk9YWZ7zewZM/vjYP15Zva4mb1gZt80s4rCl5ub9nQKd9it2wCISBnJpYU+Dlzj7tuA7cB1ZnY58GfA59z9QqAP+EDhylye7VuaiJlmMBKR8rJkoHvWqeBlMng4cA3w7WD9PcANBalwBWorE7x+Y4Nu1CUiZSWnPnQzi5vZHuAY8DDwItDv7lPBJgeBTYt89hYz6zSzzt7e3nzUnJP2dIrdPf1MZaZXbZ8iImHKKdDdPePu24HNwGXAxbnuwN13unuHu3c0NzevsMzla0+nGJnI8NyRoVXbp4hImJY1ysXd+4FHgSuAJjNLBG9tBg7lubaz0taSvcCoS90uIlImchnl0mxmTcHzauBaYB/ZYH9PsNnNwAOFKnIlNqeq2dBQqQuMRKRsJJbehI3APWYWJ3sA+Ja7P2hmzwLfMLM/BXYDdxawzmUzM9rTKTr3K9BFpDwsGeju/iSwY4H1L5HtTy9abS0pHnrqCEcGxjinsSrsckRECiqSV4rOmLlRl/rRRaQcRDrQLz23kcpETP3oIlIWIh3oFYkY2zY30alAF5EyEOlAh+w8o88cGmBsMhN2KSIiBRX5QG9Pp5iadp48OBB2KSIiBVUWgQ6awUhEoi/ygb6mtoLz19Wyq/tk2KWIiBRU5AMdsv3ou7r7cPewSxERKZiyCPT2dIq+kUlePj4cdikiIgVTFoHeoX50ESkDZRHoFzTX0VCVUKCLSKSVRaDHYjbbjy4iElVlEegA7S0pnj92ioGRybBLEREpiPIJ9NbgRl0H1EoXkWgqm0DftrmJeMzYpfuji0hElU2g11YmeP3GevWji0hk5TIF3RYze9TMnjWzZ8zso8H6T5vZITPbEzzeWfhyz057S4o9B/qZykyHXYqISN7l0kKfAj7u7pcAlwMfNrNLgvc+5+7bg8dDBasyT9pb1zA6meG5I0NhlyIikndLBrq7H3b3ruD5ENkJojcVurBCmLlRV+d+3ddFRKJnWX3oZtZKdn7Rx4NVt5rZk2Z2l5ml8lxb3p3bWMU5DVXs6ukPuxQRkbzLOdDNrA74DvAxdx8EvgJcAGwHDgN/tcjnbjGzTjPr7O3tzUPJK2dmtKdTdOnEqIhEUE6BbmZJsmF+r7vfD+DuR9094+7TwN8Cly30WXff6e4d7t7R3Nycr7pXrD2d4lD/KIcHRsMuRUQkr3IZ5WLAncA+d//snPUb52x2I/B0/svLP014ISJRlUsL/SrgJuCaM4Yo/rmZPWVmTwK/Avx+IQvNl0vObaAqGVOgi0jkJJbawN0fA2yBt4p+mOJCkvEYb9rcpH50EYmcsrlSdK6OdIpnXhlkdCITdikiInlTloHenk4xNe08eVDDF0UkOsoy0He0BBcYqdtFRCKkLAN9TW0F5zfXqh9dRCKlLAMdsjfq2tXTh7uHXYqISF6UbaB3tKboH5nkpePDYZciIpIXZRvosxcYacILEYmIsg3089fV0Vid1AVGIhIZZRvosZjR1tLErh4FuohEQ9kGOkBH6xpeOHaK/pGJsEsRETlrZR3obcF49C610kUkAso60LdtaSQeM/Wji0gklHWg11QkuGRjgwJdRCKhrAMdssMX9x4YYDIzHXYpIiJnRYGeTjE6mWHf4cGwSxEROSsKdM1gJCIRUfaBfm5TNRsbqxToIlLycplTdIu
ZPWpmz5rZM2b20WD9GjN72MyeD5apwpdbGO3plO68KCIlL5cW+hTwcXe/BLgc+LCZXQJ8EnjE3bcCjwSvS1J7OsUrA2O80j8adikiIiu2ZKC7+2F37wqeDwH7gE3A9cA9wWb3ADcUqshCUz+6iETBsvrQzawV2AE8Dmxw98PBW0eADXmtbBW9fmMD1cm4Al1ESlrOgW5mdcB3gI+5+7wxfp6dJWLBmSLM7BYz6zSzzt7e3rMqtlCS8RjbtjTqFgAiUtJyCnQzS5IN83vd/f5g9VEz2xi8vxE4ttBn3X2nu3e4e0dzc3M+ai6I9nSKZ14ZZGRiKuxSRERWJJdRLgbcCexz98/Oeet7wM3B85uBB/Jf3uppT6fITDt7DwyEXYqIyIrk0kK/CrgJuMbM9gSPdwK3A9ea2fPAO4LXJUt3XhSRUpdYagN3fwywRd5+e37LCU9TTQUXrq/TiVERKVllf6XoXO0tKXZ19zE9veD5XRGRoqZAn6M9nWJgdJKXjp8KuxQRkWVToM/RpguMRKSEKdDnuKC5lqaapAJdREqSAn0OM6O9JUWnAl1ESpAC/Qxt6RQv9Q5zcngi7FJERJZFgX6GmRt17dZ4dBEpMQr0M2zb3EQiZupHF5GSo0A/Q3VFnEvPbVA/uoiUHAX6AtrSKfYe6GcyMx12KSIiOVOgL6A9nWJ8appnXxlcemMRkSKhQF+AZjASkVKkQF/AxsZqNjVVK9BFpKQo0BfRlk7R2X2S7GRMIiLFT4G+iPaWJo4OjvPKwFjYpYiI5ESBvoj29BpA/egiUjoU6It4/cZ6qpNxuhToIlIicplT9C4zO2ZmT89Z92kzO3TGlHSRkojH2L6lic7uk2GXIiKSk1xa6HcD1y2w/nPuvj14PJTfsopDezrFvsNDDI9PhV2KiMiSlgx0d/8xUJbN1PZ0isy0s/dgf9iliIgs6Wz60G81syeDLpnUYhuZ2S1m1mlmnb29vWexu9XX1pL9z1I/uoiUgpUG+leAC4DtwGHgrxbb0N13unuHu3c0NzevcHfhaKxJsnV9nW7UJSIlYUWB7u5H3T3j7tPA3wKX5bes4tGeTtHV3cf0tC4wEpHitqJAN7ONc17eCDy92Lalri2dYnBsihd7T4VdiojIa0ostYGZ3Qe8DVhnZgeB/wy8zcy2Aw7sB363gDWGqmPOjbq2bqgPuRoRkcUtGeju/v4FVt9ZgFqK0nnraknVJOns7uN9l7WEXY6IyKJ0pegSzGy2H11EpJgp0HPQlk7x0vFhTg5PhF2KiMiiFOg56Ahu1KVWuogUMwV6Dt60uZFEzDQeXUSKmgI9B1XJOJdualQLXUSKmgI9R+0tKfYe7GdiajrsUkREFqRAz1FHa4rxqWmePTwYdikiIgtSoOeoPbjAqHN/Wd54UkRKgAI9RxsaqtjUVM3/euowp3R/dBEpQgr0ZfjI2y9k74F+rv/iY7xwbCjsckRE5lGgL8N739zC1//DWxgYneT6L/6Eh546HHZJIiKzFOjLdOUF63jwtrdy0Tn1fOjeLv70wWeZzGjki4iET4G+Auc0VvGNW67g313Zyh2Pvcxv3fE4x4bGwi5LRMqcAn2FKhIxPv3uS/nC+7bz1MEB3vXXj/H/NAJGREKkQD9L12/fxHc/fCU1FXHev/Pn3PXYy7hrdiMRWX0K9Dy4+JwGvnfb1Vxz8Xr+5MFnue2+3QxraKOIrLIlA93M7jKzY2b29Jx1a8zsYTN7PlimCltm8WuoSvLVm9r5xHUX89BTh7nhSz/RtHUisqpyaaHfDVx3xrpPAo+4+1bgkeB12TMzfu9tF/C1D7yFk8MTXP/Fn/CPGtooIqtkyUB39x8DZ57tux64J3h+D3BDnusqaVdduI4HP3I1F66v4/fu7eIzD+1jSkMbRaTAVtqHvsHdZ5qeR4ANeaonMjY2VvPN372cmy5Ps/PHL/Hbdz5O79B42GWJSISd9UlRzw7pWHRYh5ndYmadZtbZ29t7trs
rKZWJOP/lhjfw2d/cxp4D/bzrb/4vu7o1tFFECmOlgX7UzDYCBMtji23o7jvdvcPdO5qbm1e4u9L2622b+e6HrqIqGee9X/05d/9EQxtFJP9WGujfA24Ont8MPJCfcqLr9Rsb+N6tV/O2i5r59Pef5WPf3MPIhIY2ikj+5DJs8T7gZ8BFZnbQzD4A3A5ca2bPA+8IXssSGquT7Lypgz/81xfx/b2vcOOXfspLGtooInliq/nTv6Ojwzs7O1dtf8XsseeP85Fv7GZyapq/+I1tXPeGc8IuSUSKlJntcveOpbbTlaIhuXrrOr5/29Wc31zLB7++i9v/8TkNbRSRs6JAD9Gmpmq+9cEr+LdvaeG///OL/M5dT3D8lIY2isjKKNBDVpmI85kb38hf/sY2dnX38a6/foyunr6wyxKREqRALxLvad/M/R+6kmTCeO9Xf8bXfrZfQxtFZFkU6EXk0nMbefDWt/LWrc38pwee4Q++tZfRiUzYZYlIiVCgF5nGmiR3/E4HH7/2dfzDnkPc+OWfsP/4cNhliUgJUKAXoVjMuO3tW7n731/GkcExfu2Lj/Hws0fDLktEipwCvYj9y9c18/1br6Z1bS3/8X908hf/+zky0+pXF5GFKdCL3JY1NfzPD17B+y/bwpcefZGb73qCExraKCILUKCXgKpknP/262/iz//Nm3hi/0l+7W8e474nevjl0SGm1WIXkUAi7AIkd7/55i1ccm4Dt/59F5+6/ykA6isTbG9pYkdLiraWJnZsSdFYkwy5UhEJg+7lUoLcnZePD9PV08/unj66evr5xZFBZhrrFzTX0taSoi2doq0lxdb1dcRiFm7RIrJiud7LRYEeEafGp3jyQD9dPX3s7sku+0YmgWwrftuWpmwLPp1ix5YmmmoqQq5YRHKVa6CryyUi6ioTXHnhOq68cB2QbcXvPzFCV3ffbMh/8dEXZlvx58+04ltStKWb2Lq+nrha8SIlTS30MjI8PsXeg/3sntNVc3J4AsgeELZtaZwN+R0tasWLFAu10OVVaisTXHnBOq684HQrvvvEyLxumi//6MXZse7nr6vNnmxNN9HWkuJ1G9SKFylmaqHLPCMTU+w9MDAb8rt7+jgRtOJrK+JBX3w25LdvSbGmVq14kUJblRa6me0HhoAMMJXLDqW41VQkuOKCtVxxwVog24rvOZltxXd197P7QB9f+efTrfh1dRWk19bSuraW1rU1pNcFy7W1NFZr+KTIaspHl8uvuPvxPPwdKUJmRnptLem1tdy4YzOQbcU/eXCAvQf6efn4MPtPDPPTF4/zna6xeZ9N1SRJr63lvHW1pNfW0Lr29DKllr1I3qkPXZatpiLB5eev5fLz185bPzqRoefkCPtPDNN9Ypj9J0boPjHMEy+f5B/2HGJu715jdXK2JT+7XJddrq2twEx99SLLdVZ96Gb2MtAHOPBVd9+5wDa3ALcAtLS0tHd3d694f1K6xiYzHOwbYf/xmcDPLvefGOZQ3yhz72BQX5kgve6MsA+eN9dXKuyl7KzKhUVmtsndD5nZeuBh4DZ3//Fi2+ukqCxkYmqag30jsyE/G/bHhznQNzrvDpM1FfEzgr6GlrU1nNNQxYaGKmor9aNTomdVToq6+6FgeczMvgtcBiwa6CILqUjEOL+5jvOb61713mRmmlf6R2e7b/Yfzy5/cXSIH+w7ymRmfoOktiLOhoYqmusr2dBQxfqZZUMl6+uzyw0NVdQp+CWCVvx/tZnVAjF3Hwqe/yvgT/JWmQiQjMdmT8pC87z3MtPOK/2jHDg5wtGhMY4OjnNscJyjQ2P0Do6z92A/RwfHGJucftXfra2Isz4I/PUNVWyor5wN+7kHg7rKhLp4pGScTTNlA/Dd4H/2BPD37v5PealKJAfxmLFlTQ1b1tQsuo27MzQ+xbHBsdmwPzY4ng3/4PlTB/v5weA4o5Ovnr+1piJ+OvRnW/ynW/vr66vY0KDgl+Kw4kB395eAbXmsRST
vzIyGqiQNVUkuXF+/6Hang388G/5D4xyduwyC/+giwV+djNNcX0lTTZLG6iQN1UmaqrPPZx5NNdn12ecVNFYnqa2I60AgeaOORBHODP5X9+XPcHdOjU/Na+EfC7p7jp8aZ2B0kv6RSQ71jWafj06+5rSBiZjNBn5DEPqN1Qs/Zg4CMweHqmS8EF+FlDAFusgymBn1VUnqlwj+Ge7O8ESGgdFJBkYm6R+dYHB0cjb4B0bnP04OT/Dy8WH6RyYZHJvktQahVSRipwN+TvA3VCeprYxTV5mkrjJObWWC2soE9cGytjJBXWWCuqoENcm47pUfIQp0kQIys2x4VibY1FS9rM9OT2e7gQbPCP/+0YnTB4E5648MjvHckSGGxiYZnsjkPKF4bUU29OuqsnXWVgQHgKoEtcEBoa4i+/7MwWD2oFCZCA4e2XXJuGa1DJMCXaRIxeZ0x2xZs7zPujtjk9OcGp9ieHyKU8FjeHaZ4dT4JKfGMwwH64eC5fD4FIf6R+d9bmLq1SOFFlKZiM2Gezb049RUZIO/pmLm10KwDA4ctYutr4xTEY/pHMMyKNBFIsjMqK6IU12RPVl7tiYz09nQH5tieGLmwJDh1Njcg8SZB44MIxNT9I9MzDtADI9Pkevc5omYzQv9mgUOEtnlnAPC3INERWL2e6hJZpeViegeJBToIrKkZDxGU01FXiY9cXfGp6aDXwOZ2QPE8MTpXwuvej3zPFieODXC8MQUI+MZTo1PMZ7jL4gZ1UG4z1sm41TNCf6qYF1NxfzX1RUxqpOJeZ+rrohRXZGYfV2ZiIVybkKBLiKrysyoSmYDcu3S55VzMpWZZngi+4tg9kAR/CIYncwwNplhdCLDyGSGsYkMo5PBY2Ka0ckpRoN1A6OTHBkYnX1vbDL7N3P9RTFXVTJGTRDyVckYn7nxjbzljBva5ZsCXURKXiIeo7E6VpB78Ls7E5lpxiam5xwIMsGB4PS6seCAMjo5PXsQGZmYmj0w1FcVfn4ABbqIyGswMyoTcSoTcRop7klbNMZIRCQiFOgiIhGhQBcRiQgFuohIRCjQRUQiQoEuIhIRCnQRkYhQoIuIRIT5a91wOd87M+sFuldth4WxDjgedhFFRN/Hafou5tP3Md/ZfB9pd29eaqNVDfQoMLNOd+8Iu45ioe/jNH0X8+n7mG81vg91uYiIRIQCXUQkIhToy7cz7AKKjL6P0/RdzKfvY76Cfx/qQxcRiQi10EVEIkKBniMz22Jmj5rZs2b2jJl9NOyawmZmcTPbbWYPhl1L2Mysycy+bWbPmdk+M7si7JrCYma/H/wbedrM7jOzqrBrWk1mdpeZHTOzp+esW2NmD5vZ88EyVYh9K9BzNwV83N0vAS4HPmxml4RcU9g+CuwLu4gi8QXgn9z9YmAbZfq9mNkm4CNAh7u/AYgD7wu3qlV3N3DdGes+CTzi7luBR4LXeadAz5G7H3b3ruD5ENl/sJvCrSo8ZrYZ+FXgjrBrCZuZNQL/ArgTwN0n3L0/3KpClQCqzSwB1ACvhFzPqnL3HwMnz1h9PXBP8Pwe4IZC7FuBvgJm1grsAB4Pt5JQfR74I2B5061H03lAL/B3QRfUHWZWG3ZRYXD3Q8BfAj3AYWDA3f9PuFUVhQ3ufjh4fgTYUIidKNCXyczqgO8AH3P3wbDrCYOZvQs45u67wq6lSCSANuAr7r4DGKZAP6mLXdA3fD3Zg9y5QK2Z/Xa4VRUXzw4tLMjwQgX6MphZkmyY3+vu94ddT4iuAt5tZvuBbwDXmNnXwy0pVAeBg+4+84vt22QDvhy9A3jZ3XvdfRK4H7gy5JqKwVEz2wgQLI8VYicK9ByZmZHtI93n7p8Nu54wufun3H2zu7eSPeH1Q3cv21aYux8BDpjZRcGqtwPPhlhSmHqAy82sJvg383bK9ATxGb4H3Bw8vxl4oBA7UaDn7irgJrKt0T3B451hFyVF4zbgXjN7EtgOfCbkekIR/Er5NtAFPEU2Y8r
qilEzuw/4GXCRmR00sw8AtwPXmtnzZH/F3F6QfetKURGRaFALXUQkIhToIiIRoUAXEYkIBbqISEQo0EVEIkKBLiISEQp0EZGIUKCLiETE/wffoE049xSSCwAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.plot(range(1,11), inertia_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The elbow method tells us to look for where the curve straightens into a line. That point is the suggested number of clusters.\n", "\n", "We'll try this approach to determine the cluster number for the labor market data. Let's load in and clean the labor market data from the previous labs." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/usr/local/lib/python3.4/site-packages/ipykernel_launcher.py:1: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support skipfooter; you can avoid this warning by specifying engine='python'.\n", " \"\"\"Entry point for launching an IPython kernel.\n" ] } ], "source": [ "labor = pd.read_csv(\"Nov2019_labor_market_majors.csv\", skiprows = 13, \\\n", " skipfooter = 3, index_col = \"Major\")\n", "labor[\"Median Wage Early Career\"] = labor[\"Median Wage Early Career\"].str.replace(\",\",\"\").astype(float)\n", "labor[\"Median Wage Mid-Career\"] = labor[\"Median Wage Mid-Career\"].str.replace(\",\",\"\").astype(float)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "labor.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a new dataframe with the scaled data." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "labor_scaled = scaler.fit_transform(labor)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "labor_scaled = pd.DataFrame(labor_scaled,columns = labor.columns, index = labor.index)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run k-means clustering on the scaled data with k = 4." 
] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([2, 2, 2, 1, 0, 2, 2, 2, 2, 1, 1, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1,\n", " 1, 1, 1, 1, 0, 2, 0, 2, 0, 0, 0, 1, 2, 0, 2, 0, 3, 0, 0, 0, 0, 0,\n", " 2, 2, 3, 0, 1, 0, 0, 2, 2, 1, 2, 2, 2, 2, 0, 2, 1, 1, 0, 2, 1, 2,\n", " 1, 2, 1, 0, 0, 1, 2], dtype=int32)" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "kmeans = KMeans(n_clusters = 4)\n", "kmeans_clusters = kmeans.fit_predict(labor_scaled)\n", "kmeans_clusters" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute the inertia for 4 clusters." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "7.4987891124995745" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "kmeans.inertia_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What is the inertia if there is only 1 cluster? What is the inertia if every data point is its own cluster?\n", "\n", "Compute the inertia for all values of k between 1 and 10 using a loop." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "inertia_list = []\n", "for k in range(1,16):\n", " kmeans = KMeans(n_clusters = k)\n", " kmeans_clusters = kmeans.fit_predict(labor_scaled)\n", " inertia_list.append(kmeans.inertia_)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, plot the inertias as a line graph." 
] }, { "cell_type": "code", "execution_count": 29, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "Text(0,0.5,'inertia')" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYwAAAEKCAYAAAAB0GKPAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3Xl8VfWd//HXJ/tCAoQsLIGyyuYGTQE36kqpu/46rc6M1enC2NHWTjszbX9TW0c7nf7aWutUp9Vfa7Xzs9aOK7Uq4opaRAMCsskmSEIgwUAChC3J5/fHPQkhJOQGcnNy730/H4/7uPd8zzk3H3yA75zzOed8zd0RERHpSkrYBYiISHxQYIiISFQUGCIiEhUFhoiIREWBISIiUVFgiIhIVBQYIiISFQWGiIhERYEhIiJRSQu7gJ5UWFjoI0eODLsMEZG4sXjx4h3uXhTNtgkVGCNHjqS8vDzsMkRE4oaZbY52W52SEhGRqMQsMMxsuJm9YmarzGylmd0SjBeY2XwzWxe8D+xk/+uDbdaZ2fWxqlNERKITyyOMRuCb7j4JmAHcZGaTgG8DL7n7OOClYPkIZlYAfB+YDkwDvt9ZsIiISO+IWWC4e5W7Lwk+7wZWA8OAK4CHgs0eAq7sYPdPAfPdvdbddwLzgdmxqlVERLrWKz0MMxsJTAEWASXuXhWs2gaUdLDLMGBLm+WKYExEREIS88Aws37A48DX3b2+7TqPzN50QjM4mdkcMys3s/KampoT+SoRETmGmAaGmaUTCYuH3f2JYHi7mQ0J1g8BqjvYtRIY3ma5NBg7irvf7+5l7l5WVBTVpcQiInIcYnmVlAG/AVa7+8/arJoLtFz1dD3wdAe7zwNmmdnAoNk9KxjrcQcam7jvtQ28vk5HJyIixxLLI4yzgOuA881safC6GPgRcJGZrQMuDJYxszIz+zWAu9cCdwDvBK/bg7Eel5Gawn0LNvLkux0ewIiISCBmd3q7+xuAdbL6gg62Lwe+1Gb5AeCB2FR3mJkxY3QBizbW4u5EDoxERKQ93ekNTB81iMpd+6jYuS/sUkRE+iwFBjBj9CAAFm78KORKRET6LgUGMK64HwW5GbylwBAR6ZQCA0hJMaaPOtzHEBGRoykwAjNGq48hInIsCoxASx9Dp6VERDqmwAgc7mPE5HYPEZG4p8AItPQxdIQhItIxBUYbLX2MLbUNYZciItLnKDDaUB9DRKRzCow21McQEemcAqMN9TFERDqnwGhHfQwRkY4pMNqZProAUB9DRKQ9BUY7JxXnMTAnXX0MEZF2FBjtRPoYg3SEISLSjgKjAzNGF6iPISLSjgKjAzPG6H4MEZH2FBgdaOljLPpAfQwRkRYKjA6ojyEicrSYBYaZPWBm1Wa2os3Yo2a2NHhtMrOlney7yczeC7Yrj1WNxzJjdAEVO9XHEBFpkRbD734QuAf4XcuAu3+u5bOZ3QnUHWP/89x9R8yq60JLH2PRB7UML8gJqwwRkT4jZkcY7r4A6LAJYGYGfBZ4JFY//0Qdvh9Dp6VERCC8HsY5wHZ3X9fJegdeMLPFZjanF+tqpT6GiMiRwgqMazn20cXZ7j4V+DRwk5nN7GxDM5tjZuVmVl5TU9OjRaqPISJyWK8HhpmlAVcDj3a2jbtXBu/VwJPAtGNse7+7l7l7WVFRUY/W2raPISKS7MI4wrgQWOPuFR2tNLNcM8tr+QzMAlZ0tG2snVScx
wD1MUREgNheVvsIsBAYb2YVZvbFYNU1tDsdZWZDzezZYLEEeMPMlgFvA3929+djVeexaH4MEZHDYnZZrbtf28n4DR2MbQUuDj5vBE6LVV3dNWP0IOat3M6W2gZdXisiSU13enehZZ5v9TFEJNkpMLowvkR9DBERUGB0SX0MEZEIBUYUZoweRMXOfVTs1P0YIpK8FBhRaO1jaNpWEUliCowoqI8hIqLAiEprH+MDBYaIJC8FRpRmjB7Ellr1MUQkeSkwoqQ+hogkOwVGlNTHEJFkp8CIUkqKMW2k+hgikrwUGN2gPoaIJDMFRjeojyEiyUyB0Q0TBufRP1t9DBFJTgqMbtD9GCKSzBQY3dTSx6jctS/sUkREepUCo5sO9zF0lCEiyUWB0U3qY4hIslJgdNPh+TF0pZSIJBcFxnGYMXoQH9Y2qI8hIkklZoFhZg+YWbWZrWgzdpuZVZrZ0uB1cSf7zjaz981svZl9O1Y1Hi/1MUQkGcXyCONBYHYH43e5++nB69n2K80sFbgX+DQwCbjWzCbFsM5uUx9DRJJRzALD3RcAx3Oifxqw3t03uvtB4A/AFT1a3AlSH0NEklEYPYybzWx5cMpqYAfrhwFb2ixXBGMdMrM5ZlZuZuU1NTU9XWunpquPISJJprcD45fAGOB0oAq480S/0N3vd/cydy8rKio60a+L2ozRBYD6GCKSPHo1MNx9u7s3uXsz8H+JnH5qrxIY3ma5NBjrUyYOzlcfQ0SSSq8GhpkNabN4FbCig83eAcaZ2SgzywCuAeb2Rn3dkZJiTFMfQ0SSSCwvq30EWAiMN7MKM/si8GMze8/MlgPnAf8YbDvUzJ4FcPdG4GZgHrAa+KO7r4xVnSdC92OISDJJi9UXu/u1HQz/ppNttwIXt1l+Fjjqktu+pm0f4+qppSFXIyISW7rT+wSojyEiyUSBcQJa+hiLPlAfQ0QSnwLjBM0YPYjNHzWwVX0MEUlwCowT1NrH0Cx8IpLgFBgnqLWPsUGnpUQksSkwTlDr/Rg6whCRBKfA6AHqY4hIMlBg9IDpo9THEJHEp8DoAROH5JOflaY+hogkNAVGD0hNMaaNGqQ+hogkNAVGD5kxukB9DBFJaAqMHtI6z7eOMkQkQSkweoj6GCKS6BQYPUR9DBFJdAqMHtTSx6iqUx9DRBKPAqMHtfYxNAufiCQgBUYPau1jaH4MEUlACowe1NrHUGCISAJSYPSwGaML2KQ+hogkIAVGD1MfQ0QSVcwCw8weMLNqM1vRZuwnZrbGzJab2ZNmNqCTfTeZ2XtmttTMymNVYyxMHJJPnvoYIpKAYnmE8SAwu93YfOBkdz8VWAt85xj7n+fup7t7WYzqi4nUFGP6qAIFhogknJgFhrsvAGrbjb3g7o3B4ltAaax+fphmjB6kPoaIJJwwexhfAJ7rZJ0DL5jZYjObc6wvMbM5ZlZuZuU1NTU9XuTxUB9DRBJRKIFhZv8KNAIPd7LJ2e4+Ffg0cJOZzezsu9z9fncvc/eyoqKiGFTbfepjiEgi6vXAMLMbgEuBv3F372gbd68M3quBJ4FpvVZgD1AfQ0QSUVq0G5rZJcBkIKtlzN1v784PM7PZwL8An3T3hk62yQVS3H138HkW0K2f0xfMGD2IF1dXU1W3jyH9s8MuR0TkhEV1hGFmvwI+B3wVMOCvgI91sc8jwEJgvJlVmNkXgXuAPGB+cMnsr4Jth5rZs8GuJcAbZrYMeBv4s7s/3/0/WrhmnhQ5PfbQXzaHXImISM+wTs4KHbmR2XJ3P7XNez/gOXc/J/YlRq+srMzLy/vObRv//D/LePLdSp675RzGleSFXY6IyFHMbHG0ty9E28NouT60wcyGAoeAIcdTXDL59qcnkJuZxnefWkE0wSwi0pdFGxjPBHdl/wRYAmwCHolVUYliUL9MvjV7Aos+qOWppZVhlyMickKiCgx3v8Pdd7n740R6FxPc/dbYlpYYrvnEcE4fPoB///Nq6hoOhV2Oi
MhxO2ZgmNn5wfvVLS/gEuCC4LN0ISXF+MGVJ1O79yA/feH9sMsRETluXV1W+0ngZeCyDtY58ESPV5SATh7Wn8+fMZKHFm7ir8pKObW0w2cuioj0adFeJTXK3T/oaixsfe0qqbbq9x/igjtfY3B+Fk/ddBapKRZ2SSIiMblK6vEOxh6LviTJz0rn1ksn8V5lHb9fpHszRCT+HPOUlJlNIHJ3d/92PYt82tzxLdG57NQhPPrOh/x43vvMPnkIRXmZYZckIhK1ro4wxhN57tMAIn2MltdU4MuxLS3xmBm3X3EyBw4188NnV4ddjohItxzzCMPdnzazZ4BvufsPe6mmhDamqB9//8nR/OLl9Xy2bDhnjBkUdkkiIlHpsofh7k3Alb1QS9K46byxDC/I5tanV3CwsTnsckREohJt0/tNM7vHzM4xs6ktr5hWlsCy0lP5t8sns756D79+Y2PY5YiIRCXax5ufHry3fcy4A+f3bDnJ4/wJJXxqcgn/+dI6Lj9tKKUDc8IuSUTkmKJ9NMh5HbwUFifoe5dNxjD+7U+rwi5FRKRL0c6HUWJmvzGz54LlScH8FnIChg3I5pYLxzF/1XZeXLU97HJERI4p2h7Gg8A8YGiwvBb4eiwKSjZfPHsU44r78f25K9l3sCnsckREOhVtYBS6+x+BZgB3bwT0f7cekJ6awg+uPJnKXfu455V1YZcjItKpaANjr5kNItLoxsxmAHUxqyrJTB89iKunDuP+BRtZX7077HJERDoUbWB8A5gLjDGzN4HfEZnfW3rI/754Itnpqdz61ErNzicifVK0V0ktIfKo8zOBvwcmu/vyrvYzswfMrNrMVrQZKzCz+Wa2Lngf2Mm+1wfbrDOz66P748Svwn6Z/MvsCSzc+BFzl20NuxwRkaNEe4QBMA04jchzpK41s89Hsc+DwOx2Y98GXnL3ccBLwfIRzKwA+D4wPfi53+8sWBLJtdNGcFppf+54ZjX1+zU7n4j0LdFeVvvfwE+Bs4FPBK8un5/u7guA2nbDVwAPBZ8fouPHjnwKmO/ute6+E5jP0cGTcFJTjB9ceQq1ew9w5zzNzicifUu0d3qXAZO8Z06ul7h7VfB5G1DSwTbDgC1tliuCsYR3Sml/rpvxMf77rc185uPDOaW0f9gliYgA0Z+SWgEM7ukfHgTQCYWQmc0xs3IzK6+pqemhysL1jVnjKcjN5LtPvUdTsxrgItI3RH0fBrDKzOaZ2dyW13H+zO1mNgQgeK/uYJtKYHib5dJg7Cjufr+7l7l7WVFR0XGW1Lf0z07nu5dMZFlFHY+8/WHY5YiIANGfkrqtB3/mXOB64EfB+9MdbDMP+GGbRvcs4Ds9WEOfd8XpQ3n0nS38+Pk1zD55MIX9NDufiIQr2stqX+vo1dV+ZvYIsBAYb2YVwfOnfgRcZGbrgAuDZcyszMx+Hfy8WuAO4J3gdXswljTMjDuunMy+Q038x7Nrwi5HRAQ7Vh/bzN5w97PNbDdH9hqMSAsiP9YFdkdZWZmXl5eHXUaP+vHza/ivVzfw6JwZTB+t2flEpGeZ2WJ37/KqV+jiCMPdzw7e89w9v80rr6+FRaL66vnjGDYgm+8+tYJDTZqdT0TC050b9yQE2RmR2fnWVe/hgTc+CLscEUliCow4cOGkEi6cWMLPX1xH5a59YZcjIklKgREnbrt8Eo5z+59Whl2KiCQpBUacKB2Yw9cuGMe8ldt5eY1m5xOR3qfAiCNfOns0Y4pyufWplWz+aG/Y5YhIklFgxJGMtBTu/Ozp7DnQyOX3vMkb63aEXZKIJBEFRpw5ffgA5t58FoPzs/j8A4v49esbNeGSiPQKBUYc+tigXJ74hzO5aFIJP/jzar75P8vYf0hTrItIbCkw4lRuZhq//JuP848XnsQTSyr53H0L2Va3P+yyRCSBKTDiWEqKccuF47jvuo+zvnoPl93zBos37wy7LBFJUAqMBPCpyYN58qazyMlI5dr73+LRd/RIdBHpe
QqMBHFSSR5P33QW00cX8K3H3+P7T+vZUyLSsxQYCWRATga/veETfPmcUTy0cDPX/WYRtXsPhl2WiCQIBUaCSUtN4V8vmcRdnzuNJR/u4rJfvMHKrXVhlyUiCUCBkaCumlLKYzeeQVOz85lfLuTPy6vCLklE4pwCI4GdWjqAuV89i0lD87np90v4ybw1NDfrJj8ROT4KjARXnJfF7788nWunDefeVzbw5d+VU7//UNhliUgcUmAkgcy0VH541SncceXJvLa2hivvfZONNXvCLktE4owCI0mYGdfN+BgPf2k6uxoOccW9b/LK+9VhlyUicaTXA8PMxpvZ0javejP7erttzjWzujbbfK+360xU00cPYu7NZzF8YA5fePAdfvnqBj28UESiktbbP9Dd3wdOBzCzVKASeLKDTV9390t7s7ZkUTowh8e/cib//Ngy/s/za1hVVc+P/9epZGekhl2aiPRhYZ+SugDY4O6bQ64j6WRnpPKLa6fwrdkTeGb5Vj7zq79QsbMh7LJEpA8LOzCuAR7pZN0ZZrbMzJ4zs8m9WVSyMDO+cu4YHrj+E3xY28Cn7lrAfa9t4GCjHikiIkezsM5fm1kGsBWY7O7b263LB5rdfY+ZXQzc7e7jOvmeOcAcgBEjRnx882YdrByPDz9q4PZnVvLi6mpGF+byvcsmce744rDLEpEYM7PF7l4W1bYhBsYVwE3uPiuKbTcBZe5+zDlJy8rKvLy8vIcqTE6vvF/N7X9axQc79nLhxGK+e8kkRhbmhl2WiMRIdwIjzFNS19LJ6SgzG2xmFnyeRqTOj3qxtqR13vhi5n19Jt/+9AQWbviIWXct4Cfz1rD3QGPYpYlIyEIJDDPLBS4CnmgzdqOZ3RgsfgZYYWbLgP8ErnFd+9lrMtJSuPGTY3j5n87l0lOHcO8rG7jgztd4emmlLsEVSWKhnZKKBZ2Sio3yTbXc9qeVrKisZ9rIAm67fDKThuaHXZaI9IB4OSUlcaJsZAFP33Q2/3H1Kayv2cOlv3idW59awU7NtSGSVBQYEpXUFOPaaSN45Zvn8vkzRvLwos2cd+er/Pdbm2nSE3BFkoICQ7qlf046t10+mWdvOYcJg/O49akVXPqLN3j7g9qwSxORGFNgyHGZMDifR748g3v/eip1DQf57H0LueUP77Ktbn/YpYlIjCgw5LiZGZecOoQXv/lJvnb+WJ5bsY3z73yV/3p1PQcam8IuT0R6mAJDTlhORhrfmDWel77xSc4eW8iPn3+fWXct4KXV27veWUTihgJDeszwghzu/3wZv/vCNNJSjC8+VM7f/fZt1lfvDrs0EekBCgzpcTNPKuK5W2by3Usm8s6mncy6awHfeHQpmz/aG3ZpInICdOOexFTt3oPc99oGHlq4icYm56/KhvPV88cydEB22KWJCHHy8MFYUGD0XdX1+7n3lfX8/u0PMYy/nj6CfzhvDMV5WWGXJpLUFBjSZ1XsbOCel9fzP4srSE81rj9zJDfOHMPA3IywSxNJSgoM6fM27djL3S+t46mlleRmpPGFs0fxpXNGkZ+VHnZpIklFgSFxY+323fz8xbU8+942+menM2fmaG44cyS5mb0+3bxIUlJgSNxZUVnHz+av5eU11QzKzeAr547hb2d8jKz01LBLE0loCgyJW4s37+Rn89/nzfUfMTg/i5vPH8tny4aTkaYrwEViQYEhce8vG3Zw5wtrWbx5J6UDs7nlgnFcNWUYaakKDpGepPkwJO6dOaaQx248gwf/7hMMzMngnx9bzqy7FjB32Vaa9Th1kVAoMKTPMjPOHV/M3JvP4r7rPk56agpfe+RdLv7P13l+xTbNwyHSy3RKSuJGc7PzzHtV/Hz+Wjbu2EtJfiZXThnG1VNKGT84L+zyROKSehiS0Bqbmpm3cjtPLKng1bU1NDU7k4fmc/XUUi4/bShFeZlhlygSN+IiMMxsE7AbaAIa2xdsZgbcDVwMNAA3uPuSY32nAiP57NhzgD8t2
8oTSyp5r7KO1BRj5rhCrppayqxJJbosV6QL8RQYZe6+o5P1FwNfJRIY04G73X36sb5TgZHc1m3fzRPvVvLUu5VU1e0nLzONi08ZwlVThzFtZAEpKRZ2iSJ9TqIExn3Aq+7+SLD8PnCuu1d19p0KDAFoanYWbfyIx5dU8vyKKvYebGLYgGyumjKMq6YOY0xRv7BLFOkzuhMYYT5/wYEXzMyB+9z9/nbrhwFb2ixXBGOdBoYIQGqKcebYQs4cW8gdV07mhZXbeeLdSv7r1fXc88p6Th8+gKunDuOyU4fqoYci3RBmYJzt7pVmVgzMN7M17r6gu19iZnOAOQAjRozo6RolzuVkpHHllGFcOWUY2+v38/TSSp5YUsn3nl7JHc+s4rzxxVw9dRjnTSgmM039DpFj6RNXSZnZbcAed/9pmzGdkpKYWbW1niffreCppVup2X2A/tnpXHrqEK44fRhTRwzQHeWSNPr8KSkzywVS3H138HkWcHu7zeYCN5vZH4g0veuOFRYi3TFpaD6Thk7iW7Mn8Mb6HTz5biWPL6ng4UUfkpeVxtljC5l5UhEzTypimGYHFAHCOyVVAjwZuXKWNOD37v68md0I4O6/Ap4lcoXUeiKX1f5dSLVKAktLTeHc8cWcO76Y3fsPsWDtDhasrWHBuhqeW7ENgDFFua3hMWPUILIzdOpKklOfOCXVU3RKSnqKu7O+eg+vra1hwbodLNr4EQcam8lIS2HayAJmnhQ5Ahlfkkfwi49IXIqLy2pjQYEhsbL/UBNvf1DbevSxdvseAEryMzlnXBHnjCvknHFFFOiqK4kzfb6HIRJvstJTW09LAVTV7eP1tTt4bV0N81dt57HFFZjBKcP6M3NcZLspIwaQrua5JBAdYYicoKZmZ3nFrkj/Y10N7364k2aHvMw0zhgzKNL7GF3AqMJ+pOpuc+ljdEpKJER1+w7xl/WR8FiwdgeVu/YBkJWewvjB+Uwakh+5SmtIPhMG52n+cgmVAkOkj3B3NtTsZemWXazaWs+qqjpWba2nfn8jAGYwalAuE4MAaQmT4rxMNdOlV6iHIdJHmBlji/sxtrgffDwy5u5srdsfCZAgRJZX7OLPyw/fZjQoN6P1KKTlfVRhrm4olFApMER6mZkxbEA2wwZkc9Gkktbx+v2HWFO1m1Vb61hVVc+qqnp+++YmDjY1A5CZlsL4wXmtITJxSD7DB+YwqF+GmuvSK3RKSqQPO9TUzIaaPazaWs/qIERWbq1nV8Oh1m3MoCAng6K8TIryMinOywreM1vfi/MjY/3UL5F2dEpKJEGkp6YwYXA+Ewbnt465O9vq97O6qp6quv3U7D5A9e4DVNcfoGbPATZU76BmzwEONR39y2BORmq7MMlqEzSR98H5WQzqp1kL5WgKDJE4Y2YM6Z/NkP6dP+PK3dnVcIjq3QeCQNnf5vMBanbvZ8223by+dge7DzQetX9hvwwmBk34icEpsNHqoSQ9BYZIAjIzBuZmMDA3g/GD84657b6DTdTsPkDNnv1U1x+gctc+3t+2+6geSkZaCuNL8pg4JK81SCYOzSc/K703/kjSBygwRJJcdkYqIwblMGJQzlHrWnooq6vqgz7Kbl5cXc0fyytatykdmH3EkcikIfmUDszWZcEJSIEhIp1q20O5akpkzN2p3n0gciVXm2b8/NXbabmGJi8rjYmD8yNHI8EVXSMLc8lOTyUtxRQmcUqBISLdYmaU5GdRkp/FeeOLW8cbDjby/rbdrK7azaqqOlZX7eaxxRXsXdjUbv/IJcIZqSlkpqcG74eXM9NSWl8ZaSlkph3e5oixYJuivEzGFvdjVGGuZk2MMQWGiPSInIw0powYyJQRA1vHmpudD2sbWF1Vz5adDRxsbOZgYzMHjng1HTF2sLGJPQcaqd179PqDjc3sP9REcwd3A6QYjCjIYWxxP8YU9WNMcMPk2OJ+6rP0EAWGiMRMSooxsjCXkYW5Pfq9jU3NHGxqZv+hZqrq9
rG+eg8bavayoXoP66v3sGDtjtZmPUBxXiZjig4HSEuolOTrESzdocAQkbiTlppCWmoKORlQkJvB5KH9j1jf2NTMlp0tQRIJkfXVe3jq3cojLiPOy0xjdHE/xha1hEguY4v7MaIgR5cQd0CBISIJJy01hVGFuYwqzOUiDj9+paVhv6F6D+vbBMnr62p4fMnhK7/SU43B/bMYkp9NSf8sBudnMrh/NoPzsxjcP/IqzstMukeyKDBEJGm0bdifObbwiHX1+w+1ntLaULOXqrp9bKvbz/KKXbxQt58Djc3tvgsK+2UeDpE270P6ZwVBk5VQj69PnD+JiMgJyM9KP6pp36Llzvlt9fvZVrf/8Hvw+cOPGnj7g1rq9h06at+8rLTWMBnSP4vSgTkML8iOvA/MoTgvk5Q4mVir1wPDzIYDvwNKAAfud/e7221zLvA08EEw9IS7396bdYqItGh75/zEIfmdbrfvYFObUNnHtroDbK/fHzlaqT/A+9tqqN594Ih9MtJSKB2QTWlBDsMHZrcGyvCBOQwvyGFgTnqfacyHcYTRCHzT3ZeYWR6w2Mzmu/uqdtu97u6XhlCfiMhxyc5Ibe2ddGb/oSYqd+1jS20DW3buo6K2gYqd+9iys4H3Knaxs+HIo5TcjNQjj0raBUteL14y3OuB4e5VQFXwebeZrQaGAe0DQ0Qk4WSlp0buEynq1+H6PQcaqdjZwJballCJfK7Y2cBbG2vZ0+5hkQNy0jmpOI8/3nhGzGsPtYdhZiOBKcCiDlafYWbLgK3AP7n7yk6+Yw4wB2DEiBGxKVREpJf0y0w76pH2LVp6KW1DZMvOBpo6upMxBkKbQMnM+gGvAf/u7k+0W5cPNLv7HjO7GLjb3cd19Z2aQElEpHu6M4FSKBcRm1k68DjwcPuwAHD3enffE3x+Fkg3s8L224mISO/p9cCwSLv/N8Bqd/9ZJ9sMDrbDzKYRqfOj3qtSRETaC6OHcRZwHfCemS0Nxv43MALA3X8FfAb4ipk1AvuAazyRJh8XEYlDYVwl9QZwzIuK3f0e4J7eqUhERKKRXA9CERGR46bAEBGRqCgwREQkKgoMERGJSmg37sWCmdUAm8Ouo51CYEfYRURJtcZOPNUbT7VCfNXbF2v9mLsXRbNhQgVGX2Rm5dHeRRk21Ro78VRvPNUK8VVvPNXaEZ2SEhGRqCgwREQkKgqM2Ls/7AK6QbXGTjzVG0+1QnzVG0+1HkU9DBERiYqOMEREJCoKjBgws+Fm9oqZrTKzlWZ2S9g1dcXMUs3sXTN7JuxaumJmA8zsMTNbY2arzSz2U40dJzP7x+DvwAoze8TMssKuqS0ze8DMqs1sRZuxAjObb2brgveBYdbYVif1/iR+mqSbAAADk0lEQVT4u7DczJ40swFh1tiio1rbrPummXm8TdugwIiNlnnLJwEzgJvMbFLINXXlFmB12EVE6W7geXefAJxGH63bzIYBXwPK3P1kIBW4JtyqjvIgMLvd2LeBl4JJy14KlvuKBzm63vnAye5+KrAW+E5vF9WJBzm6VsxsODAL+LC3CzpRCowYcPcqd18SfN5N5H9ow8KtqnNmVgpcAvw67Fq6Ymb9gZlE5lTB3Q+6+65wqzqmNCDbzNKAHCJTDvcZ7r4AqG03fAXwUPD5IeDKXi3qGDqq191fcPeWia7fAkp7vbAOdPLfFuAu4F+AuGsgKzBirIt5y/uKnxP5C9wcdiFRGAXUAL8NTqH92sxywy6qI+5eCfyUyG+SVUCdu78QblVRKXH3quDzNqAkzGK66QvAc2EX0RkzuwKodPdlYddyPBQYMRTMW/448HV3rw+7no6Y2aVAtbsvDruWKKUBU4FfuvsUYC9965RJq+Dc/xVEQm4okGtmfxtuVd0TTFwWF78Jm9m/Ejkd/HDYtXTEzHKITBb3vbBrOV4KjBjpat7yPuQs4HIz2wT8ATjfzP5fuCUdUwVQ4e4tR2yPEQmQvuhC4AN3r3H3Q8ATwJkh1
xSN7WY2BCB4rw65ni6Z2Q3ApcDf9OHZOccQ+eVhWfDvrRRYYmaDQ62qGxQYMRDNvOV9hbt/x91L3X0kkYbsy+7eZ38LdvdtwBYzGx8MXQCsCrGkY/kQmGFmOcHfiQvoow36duYC1wefrweeDrGWLpnZbCKnVC9394aw6+mMu7/n7sXuPjL491YBTA3+TscFBUZstMxbfr6ZLQ1eF4ddVAL5KvCwmS0HTgd+GHI9HQqOgh4DlgDvEfn31qfu9DWzR4CFwHgzqzCzLwI/Ai4ys3VEjpJ+FGaNbXVS7z1AHjA/+Lf2q1CLDHRSa1zTnd4iIhIVHWGIiEhUFBgiIhIVBYaIiERFgSEiIlFRYIiISFQUGCIxZGYjO3paqUg8UmCIiEhUFBgivcTMRgcPTPxE2LWIHI+0sAsQSQbBo0z+ANwQr08qFVFgiMReEZHnMV3t7n31uVciXdIpKZHYqyPyIMKzwy5E5EToCEMk9g4CVwHzzGyPu/8+7IJEjocCQ6QXuPveYLKq+UFozA27JpHu0tNqRUQkKuphiIhIVBQYIiISFQWGiIhERYEhIiJRUWCIiEhUFBgiIhIVBYaIiERFgSEiIlH5/1eVGKrSo49KAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.plot(range(1,16),inertia_list)\n", "plt.xlabel(\"k\")\n", "plt.ylabel(\"inertia\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Where do you think the elbow is for this graph?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Silhouette Score\n", "\n", "Instead of computing the inertia, which requires a cluster center, we can compute the silhouette score. \n", "\n", "First the Silhouette Coefficient is calculated for each data point. If a is the mean distance from that point to all other points in its cluster and if b is the mean distance to all other points in the nearest cluster that the point is not part of, then the Silhouette Coefficient for a data point is \n", "$$\\frac{b - a}{\\max\\{a,b\\}}$$\n", "\n", "The Silhouette Score is the mean silhouette coefficient for all data points.\n", "\n", "Again compute the k-means clusters for the iris data set with k =3." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "kmeans = KMeans(n_clusters = 3)\n", "kmeans_clusters = kmeans.fit_predict(iris_scaled)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can compute the silhouette score as follows." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.5047687565398588" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "silhouette_score(iris_scaled,kmeans_clusters)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can find the value of k giving a high (best) silhouette score by using a loop to try different values of k, similarly to the elbow method. Try doing this below." 
] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [], "source": [ "silhouette_score_list = []\n", "for k in range(2,11):\n", " kmeans = KMeans(n_clusters = k)\n", " kmeans_clusters = kmeans.fit_predict(iris_scaled)\n", " score = silhouette_score(iris_scaled,kmeans_clusters)\n", " silhouette_score_list.append(score)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAD8CAYAAACb4nSYAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3XmYVPWd7/H3t/eVpjfWbuhmcUHAhRaMK5poUDOYxJk8MJMZnajEqJPkmjtXTTLXGTNzY+beyZN5nkEdNIsxUeO4JGRzyQSXqCiNogiIQDc7Si9svdDr9/5Rp9u2behqqOZUV31ez1MPdU6d6vo0yud3tjrH3B0REUkOKWEHEBGRE0elLyKSRFT6IiJJRKUvIpJEVPoiIklEpS8ikkRU+iIiSUSlLyKSRFT6IiJJJC3sAP2VlJR4RUVF2DFEREaU1atX17t76WDLxV3pV1RUUF1dHXYMEZERxcy2RbOcdu+IiCQRlb6ISBJR6YuIJBGVvohIElHpi4gkEZW+iEgSUemLiCSRhCn9A60d/NuzG9lS1xR2FBGRuJUwpd/R1c39L9Vw3/Nbwo4iIhK3Eqb0S/IyWXT2JJ56cxe79reGHUdEJC4lTOkDLLlwCgD3v1gTchIRkfiUUKU/YXQ2nztzIo+8vp36praw44iIxJ2EKn2AG+dPpb2rmx+/XBt2FBGRuJNwpT+1NI8rZo7np69s4+DhjrDjiIjElYQrfYCvzJ/KobZOHno1qiuNiogkjYQs/ZkTC7jopFJ+9KdaWtu7wo4jIhI3ErL0AW6+eBoNze08Vr0j7CgiInEjYUt/bmURZ1cU8p8vbKG9szvsOCIicSFhSx/gpounsfvAYX61ZlfYUURE4kJCl/78k0qZMX4U976wha5uDzuOiEjoErr0zYybLp5KTV0zz657P+w4IiKhS+jSB7h85ngqS3JZ+vxm3LW2LyLJLeFLPzXFuPGiKbyz6yAvbqoPO46ISKgSvvQBPndmGeMLsli6YnPYUUREQhVV6ZvZAjPbaGabzez2IyzzBTNbb2brzOzhPvOvMbNNweOaWAUfioy0FG64YAqv1zZSvbUxjAgiInFh0NI3s1RgKXA5MANYbGYz+i0zHbgDOM/dTwO+HswvAu4E5gFzgTvNrDCmv0GUFs0tpyg3g3t0kxURSWLRrOnPBTa7e427twOPAlf1W+YGYKm77wNw973B/E8Dz7l7Y/Dac8CC2EQfmpyMNP723Ar++O5e1u0+EEYEEZHQRVP6E4G+1zLYGczr6yTgJDN72cxWmtmCIbwXM1tiZtVmVl1XVxd9+iH6m09UkJeZxr1a2xeRJBWrA7lpwHRgPrAYuN/MRkf7Zndf5u5V7l5VWloao0gfV5CTzhfPmczv1u6htr552D5HRCReRVP6u4DyPtNlwby+
dgLL3b3D3WuB94gMAtG894S67vxK0lNT+M8XtLYvIsknmtJfBUw3s0ozywAWAcv7LfNLImv5mFkJkd09NcAzwGVmVhgcwL0smBea0vxMvlBVzhNv7GTPAd1AXUSSy6Cl7+6dwC1EynoD8Ji7rzOzu8xsYbDYM0CDma0HVgB/7+4N7t4IfIfIwLEKuCuYF6olF06h2+H+F3VLRRFJLhZvlyaoqqry6urqYf+cWx9bw+/Xvs/Lt19CUW7GsH+eiMhwMrPV7l412HJJ8Y3cgdw0fyqHO7v4iW6gLiJJJGlLf9qYfC6bMZafvLKVQ7qBuogkiaQtfYCb5k/j4OFOfv7a9rCjiIicEEld+qeXj+aC6SU88FIthzt0A3URSXxJXfoQWduvb2rjv1bvDDuKiMiwS/rSP2dKEWdOGs1/vrCFji7dQF1EElvSl76ZcfP8aezc18qv39oddhwRkWGV9KUPcMkpYzhlXD73Pr+Fbt1AXUQSmEofSEkxvjJ/Kpv2NvHchg/CjiMiMmxU+oErZ41nUlEO96zQDdRFJHGp9ANpqSnceNFU3tp5gJc3N4QdR0RkWKj0+7h6zkTG5GfqBuoikrBU+n1kpqWy5MIpvFrTwBvb94UdR0Qk5lT6/SyeO4nROencs0I3WRGRxKPS7yc3M41rz63gDxs+4N33D4YdR0QkplT6A7j23ApyMlJ1A3URSTgq/QGMzsngi+dM5tdv7WZ7Q0vYcUREYkalfwTXnV9JWkoK972otX0RSRwq/SMYOyqLP68q4/HqnXxw8HDYcUREYkKlfxQ3XjiVzu5uHnipJuwoIiIxodI/iknFOSw8fQI/f207+1vaw44jInLcVPqD+Mr8abS0d/GTV7aGHUVE5Lip9Adx8rh8PnXqWH788laa2jrDjiMiclyiKn0zW2BmG81ss5ndPsDr15pZnZmtCR7X93mtq8/85bEMf6LcdPFUDrR28IhuoC4iI1zaYAuYWSqwFLgU2AmsMrPl7r6+36K/cPdbBvgRre5+xvFHDc9Zkwo5d2ox979Uw9+cO5nMtNSwI4mIHJNo1vTnApvdvcbd24FHgauGN1b8uWn+NPYeauOJ1bvCjiIicsyiKf2JwI4+0zuDef1dbWZvm9njZlbeZ36WmVWb2Uoz++xAH2BmS4Jlquvq6qJPfwKdN62Y08sKuO+FLXTqBuoiMkLF6kDur4EKd58NPAc82Oe1ye5eBfwl8AMzm9r/ze6+zN2r3L2qtLQ0RpFiy8y46eJpbG9s4bdr94QdR0TkmERT+ruAvmvuZcG8Xu7e4O5tweQDwJw+r+0K/qwBngfOPI68obr01LFMH5PHPSt0A3URGZmiKf1VwHQzqzSzDGAR8JGzcMxsfJ/JhcCGYH6hmWUGz0uA84D+B4BHjJ4bqG/84BB/fHdv2HFERIZs0NJ3907gFuAZImX+mLuvM7O7zGxhsNhXzWydmb0FfBW4Nph/KlAdzF8B3D3AWT8jyp+dPoGywmz+QzdQF5ERyOKtuKqqqry6ujrsGEf10Mpt/MMv3+HhG+Zx7tSSsOOIiGBmq4Pjp0elb+Qeg7+YU0ZJXqZusiIiI45K/xhkpady/QWVvLSpnrd27A87johI1FT6x+iv5k1iVFYa9zy/OewoIiJRU+kfo/ysdK49t4Jn1n3Apg8OhR1HRCQqKv3jcO15lWSnp3LvC9q3LyIjg0r/OBTlZrB47iR+tWY3Oxp1A3URiX8q/eN0w4WVpBgse1G3VBSR+KfSP07jC7K5+qwyflG9g72HdAN1EYlvKv0Y+PJFU+ns6uZHf9oadhQRkaNS6cdAZUkuV8waz89WbuNAS0fYcUREjkilHyM3zZ9GU1snP311a9hRRESOSKUfIzMmjOKSU8bwo5draWnXDdRFJD6p9GPo5ounsq+lg0df3zH4wiIiIVDpx9CcyUXMrSxi2Ys1tHfqlooiEn9U+jF288XTeP/gYZ56c2fYUUREPkalH2MXTi9h5sRR3PdC
DV26paKIxBmVfoyZGTfPn0ZtfTO/f0c3UBeR+KLSHwafPm0cU0pzWbpii26pKCJxRaU/DFJSjK9cNJUNew7y/Ma6sOOIiPRS6Q+Tz545kYmjs1m6QjdZEZH4odIfJumpKSy5cArV2/bxem1j2HFERACV/rD6QlU5xbkZWtsXkbih0h9G2RmpfOn8Sl54r043UBeRuBBV6ZvZAjPbaGabzez2AV6/1szqzGxN8Li+z2vXmNmm4HFNLMOPBH/9icmU5GXw5YdWU1vfHHYcEUlyg5a+maUCS4HLgRnAYjObMcCiv3D3M4LHA8F7i4A7gXnAXOBOMyuMWfoRYFRWOj+7fh7tXd0sXraSrSp+EQlRNGv6c4HN7l7j7u3Ao8BVUf78TwPPuXuju+8DngMWHFvUkeuUcaN4+IZI8S9S8YtIiKIp/YlA38tG7gzm9Xe1mb1tZo+bWflQ3mtmS8ys2syq6+oS87z2U8aN4ufXq/hFJFyxOpD7a6DC3WcTWZt/cChvdvdl7l7l7lWlpaUxihR/Th0fKf62zi4W36/iF5ETL5rS3wWU95kuC+b1cvcGd28LJh8A5kT73mRz6vhRPHzDORzuiBT/tgYVv4icONGU/ipguplVmlkGsAhY3ncBMxvfZ3IhsCF4/gxwmZkVBgdwLwvmJbW+xb9omYpfRE6cQUvf3TuBW4iU9QbgMXdfZ2Z3mdnCYLGvmtk6M3sL+CpwbfDeRuA7RAaOVcBdwbykF9nVo+IXkRPL4u0qkFVVVV5dXR12jBNm/e6D/NUDK8lOT+WRJecwuTg37EgiMgKZ2Wp3rxpsOX0jN2QzJkTW+Fs6uli8bCXbG1rCjiQiCUylHwdmTBjFw0HxL1r2qopfRIaNSj9ORNb456n4RWRYqfTjyGkTCnqLf/H92tUjIrGn0o8zp00o4GfXzaO5vZPF969kR6OKX0RiR6Ufh2ZOjBR/U1sni5ap+EUkdlT6cWrmxMiuHhW/iMSSSj+OqfhFJNZU+nFOxS8isaTSHwFU/CISKyr9EaKn+A8d7tBZPSJyzFT6I0ik+M/hYGuk+HfuU/GLyNCo9EeYWWUfFv+iZSp+ERkalf4INKusgJ9dP0/FLyJDptIfoWaXje4tfu3qEZFoqfRHsJ7i398SKf5d+1vDjiQicU6lP8LNLhvNz4PiX7TsVRW/iByVSj8BzC4bzc+uU/GLyOBU+gni9PIPi3/xMu3qEZGBqfQTyOnlo3nounnsa2ln8bKV7Fbxi0g/Kv0Ec0ZP8Te3s0jFLyL9qPQT0Bnlo3noehW/iHycSj9B9S3+xfer+EUkIqrSN7MFZrbRzDab2e1HWe5qM3MzqwqmK8ys1czWBI/7YhVcBndG+Wh+et1cGpsixb/ngIpfJNkNWvpmlgosBS4HZgCLzWzGAMvlA18DXuv30hZ3PyN43BiDzDIEZ04q7C3+RctU/CLJLpo1/bnAZnevcfd24FHgqgGW+w7wPeBwDPNJDJw5qZAHVfwiQnSlPxHY0Wd6ZzCvl5mdBZS7+28HeH+lmb1pZi+Y2QUDfYCZLTGzajOrrquriza7DMFZQfE3NEVO53z/gMZmkWR03AdyzSwF+D7wjQFe3gNMcvczgVuBh81sVP+F3H2Zu1e5e1VpaenxRpIjOCvY1VPf1M6iZa+q+EWSUDSlvwso7zNdFszrkQ/MBJ43s63AOcByM6ty9zZ3bwBw99XAFuCkWASXY3PWpEIe/JKKXyRZRVP6q4DpZlZpZhnAImB5z4vufsDdS9y9wt0rgJXAQnevNrPS4EAwZjYFmA7UxPy3kCGZMzlS/HWH2rj1sTW4e9iRROQEGbT03b0TuAV4BtgAPObu68zsLjNbOMjbLwTeNrM1wOPAje7eeLyh5fjNmVzIN688lVe2NPBf1TvDjiMiJ4jF21peVVWVV1dXhx0jKXR3O4vuX8m7ew7yh1svYsyorLAjicgxMrPV7l412HL6Rm4SS0kx7v78LA53dnPn8nVhxxGR
E0Cln+SmlObxtU9O5/fvvM/T77wfdhwRGWYqfWHJhVOYMX4U//tX73CgtSPsOCIyjFT6QnpqCt+7ejb1TW1893cbwo4jIsNIpS8AzCor4IYLpvDoqh28sqU+7DgiMkxU+tLr6586icnFOdzx5Fpa27vCjiMiw0ClL72yM1L57udnsa2hhR/84b2w44jIMFDpy0ecO7WERWeXc/9LNazdeSDsOCISYyp9+Zg7rjiVkrxMbnvibTq6usOOIyIxpNKXjynITueuq2ayfs9B7n9Jl0oSSSQqfRnQgpnjWHDaOH7wh03U1DWFHUdEYkSlL0d011WnkZWWwu1PrqW7O76u0SQix0alL0c0ZlQW37ryVF6vbeSRVdvDjiMiMaDSl6P6QlU5504t5u7fvasbrogkAJW+HJWZ8d3Pz6Kju5tv//Id3XBFZIRT6cugJhfncuulJ/GHDR/w27V7wo4jIsdBpS9R+dJ5lcyaWMA/Ll/Hvub2sOOIyDFS6UtU0oIrce5v6eCff6srcYqMVCp9idqMCaP48kVTeOKNnbz4Xl3YcUTkGKj0ZUj+7pLpTCnN5ZtPraWlvTPsOCIyRCp9GZKs9FTu/vxsdu5r5d+e1ZU4RUYalb4M2dzKIv5q3iR+/HIta3bsDzuOiAxBVKVvZgvMbKOZbTaz24+y3NVm5mZW1WfeHcH7NprZp2MRWsJ3++WnMCY/i9sef5v2Tl2JU2SkGLT0zSwVWApcDswAFpvZjAGWywe+BrzWZ94MYBFwGrAAuCf4eTLC5Wel88+fncnGDw5x3wtbwo4jIlGKZk1/LrDZ3WvcvR14FLhqgOW+A3wP6Ptd/auAR929zd1rgc3Bz5ME8KkZY/nM7PH8xx83s3nvobDjiEgUoin9icCOPtM7g3m9zOwsoNzdfzvU98rI9o8LTyMnM5XbntCVOEVGguM+kGtmKcD3gW8cx89YYmbVZlZdV6fzv0eSkrxM/uHKGazeto+HVm4LO46IDCKa0t8FlPeZLgvm9cgHZgLPm9lW4BxgeXAwd7D3AuDuy9y9yt2rSktLh/YbSOg+f9ZELphewr8+/S679reGHUdEjiKa0l8FTDezSjPLIHJgdnnPi+5+wN1L3L3C3SuAlcBCd68OlltkZplmVglMB16P+W8hoTIz/s/nZuHAt55aqytxisSxQUvf3TuBW4BngA3AY+6+zszuMrOFg7x3HfAYsB54GrjZ3buOP7bEm/KiHP7nZSfz/MY6frVmd9hxROQILN7Wyqqqqry6ujrsGHIMurqdq+99hW0Nzfzh1osozssMO5JI0jCz1e5eNdhy+kauxExqivGvfz6bprZOvvOb9WHHEZEBqPQlpk4am89X5k/jl2t2s2Lj3rDjiEg/Kn2JuZsvnsq0MXl868m1NLXpSpwi8USlLzGXmZbK966ezZ6Dh/m/T78bdhwR6UOlL8NizuRCrvlEBT9duY3V2xrDjiMiAZW+DJu///TJTCjI5rYn1tLWqTN1ReKBSl+GTW5mGv/yuZls3tvE0j9uDjuOiKDSl2E2/+QxfO7Midzz/Bbeff9g2HFEkp5KX4bdP3xmBqOy07ntibV06UqcIqFS6cuwK8rN4M4/m8FbO/bz45drw44jktRU+nJCLDx9ApecMoZ/e/Y9djS2hB1HJGmp9OWEMDP++bMzSTH4pq7EKRIalb6cMBNGZ3Pb5afw0qZ6nnjjY7dVEJETQKUvJ9QX502manIh3/nNeuoOtYUdRyTpqPTlhEpJMe6+ejat7V3846/XhR1HJOmo9OWEmzYmj7+7ZBq/fXsPz63/IOw4IklFpS+h+PJFUzllXD7f/uVaDh7uCDuOSNJQ6UsoMtJS+N7Vs6k71Mbdv9eVOEVOFJW+hOb08tF86bxKHn5tOytrGsKOI5IUVPoSqlsvO4nyomzueHIthzt0JU6R4abSl1DlZKTx3c/Npra+mX//701hxxFJeCp9Cd3500v4izllLHuxhnW7D4QdRyShqfQl
LnzrylMpzMngtifeprOrO+w4IgkrqtI3swVmttHMNpvZ7QO8fqOZrTWzNWb2JzObEcyvMLPWYP4aM7sv1r+AJIbRORn808LTeGfXQX74J12JU2S4pA22gJmlAkuBS4GdwCozW+7u6/ss9rC73xcsvxD4PrAgeG2Lu58R29iSiK6YNY5LZ4zl+8+9R2VJLuVFOZTkZVKUm0FqioUdTyQhDFr6wFxgs7vXAJjZo8BVQG/pu3vfWyLlArqEogxZz5U4P/2DF1ny0Ore+SkWuSZ/SV4mJXmZFOd9+LwkL4OS/ExK+7yWnqq9liJHEk3pTwR29JneCczrv5CZ3QzcCmQAl/R5qdLM3gQOAt9295eOPa4kurGjsljxjfls2ttEfVNb5HGojbqm9t7pbdubqT/UTusRTvEcnZPeOyAU5/UMCH0GivwPp7PSU0/wbygSrmhKPyruvhRYamZ/CXwbuAbYA0xy9wYzmwP80sxO67dlgJktAZYATJo0KVaRZIQqzM1gbmXRoMs1t3V+ODD0DAqH2vvMa2P97oPUH2rjUFvngD8jPzPtI4PAR7ciMinN/3A6NzNm/1xEQhPN/8W7gPI+02XBvCN5FLgXwN3bgLbg+Woz2wKcBFT3fYO7LwOWAVRVVWnXkEQlNzON3Mw0JhfnDrrs4Y6u3sGhoc9AUXfowwFi094mXq1pY3/LwNcCyk5PpTAnnVHZwSMrnVHZaRT0Pk8Pnqd9+DyYzstMw0zHJSR80ZT+KmC6mVUSKftFwF/2XcDMprt7zzdrrgQ2BfNLgUZ37zKzKcB0oCZW4UWilZWeSllhDmWFOYMu297ZTWNzZIuhrqmNht6tiDb2t3ZwoLWDg60d7NrfyoY9kedH2pLokWL0DhSRwSCtz/PIwFDwkcEknYJgmVHZ6doNJTEzaOm7e6eZ3QI8A6QCP3L3dWZ2F1Dt7suBW8zsU0AHsI/Irh2AC4G7zKwD6AZudPfG4fhFRGIlIy2FcQVZjCvIivo9Xd1O0+HOyIBwODIQfPh8oPmd7D3Y1Dv/cMfRv5uQkZYSDBJpAw4MU0rzmFdZRFlhtrYo5Kgs3u5VWlVV5dXV1YMvKJJA2jq7ONja+bGBof/g0fP6weD1nq2Ozu7Iv+PxBVnMrSxibmUR8yqLmFqap0EgSZjZanevGmw5HZkSiQOZaamU5qdSmp855Pd2dzub9jbxem0Dr9U28sqWBn61ZjcAxbkZnF1R1DsQnDp+lL7zkORU+iIjXEqKcfK4fE4el89ff6ICd2dbQwuv1zbyWm0jr29t4Ol17wORs5WqKgqZW1nM3MoiZk0sICNN32tIJip9kQRjZlSU5FJRkssXzo6ceLd7fyurtgaDQG0jKzZGblyTlZ7CWZMKe7cEziwvJDtDB40TmfbpiySh+qY2qvsMAuv3HMQd0lON2WWjeweBOZMLGZWVHnZciUK0+/RV+iLCgdYO3ti2LxgEGnh75wE6u50UgxkTRjG3IrI76OyKQorzhn7cQYafSl9EjllLeydrtu/v3RJ4Y/s+2jojp5VOH5PXuyUwt7KI8QXZIacVUOmLSAy1dXbxzq4DvYNA9dZ9NAVfSJtUlPOR00QnFeXoNNEQqPRFZNh0dnXz7vuHencHvV7byL7g8hVjR2VydkVkAJg5sYDKklxG52SEnDj+dHR1s72xhS17m6ipb6amronC3AzuuPzUY/p5Ok9fRIZNWmoKMycWMHNiAdedX0l3t7Olrql3S+C12gZ+8/ae3uVH56QzuTiXyuKcyJlFxZGziyqLcynISewDxY3N7dTUNVFT18yWuia21EUKfntjS++X6gBK8zM5f1rJsOfRmr6IxJy7s6OxlY0fHGJrfTO1Dc1sa2hma30Luw+00rd2RuekU1GcS2VJLpOLc6jsGRRG0IDQ0dXNtoaWSLnXN39k7X1fnwv4ZaSmUFGSw9TSPKaU5gZ/Rp4f71lSWtMXkdCYGZOKc5hU/PEL3B3u6GJHYwu19c1sbWhm
a0MLW+ubea2mgafe/OgFfAtz0j/cMijOpaIkp3croSD7xA8Ijc3tbKlr+siae01d84Br7VNKcrl81nimlETKfWppHhMLs0P/RrRKX0ROqKz0VKaPzWf62PyPvXa4o4vtwYCwraGZ2vojDwhFuRmRLYPiXCYHA0Jka+H4BoS+a+09u2Jq6iMFv7/fWntlSS4nj8vn8lnjYrrWPpxU+iISN7LSUzlpbD4nHWFA2NbQEtk66NlKqG/h1ZoGnhxgQKgo/nCrILK1EDme0FPIfdfae8u9rpltjS10DbDWfkXPWvuYPKaWxMda+7FQ6YvIiJCVntp7jaH+Wts/3ELY2tCzldDMK1s+PiAU52bQ5X7EtfYrZo1nSmnuiFhrPxYqfREZ8bIzjj4gbGvs2TqI7C5KSbGEWGs/Fip9EUlo2RmpnDJuFKeMGxV2lLiga6qKiCQRlb6ISBJR6YuIJBGVvohIElHpi4gkEZW+iEgSUemLiCQRlb6ISBKJu0srm1kdsO04fkQJUB+jOLGkXEOjXEOjXEOTiLkmu3vpYAvFXekfLzOrjuaa0ieacg2Ncg2Ncg1NMufS7h0RkSSi0hcRSSKJWPrLwg5wBMo1NMo1NMo1NEmbK+H26YuIyJEl4pq+iIgcQUKUvpmVm9kKM1tvZuvM7GthZwIwsywze93M3gpy/VPYmfoys1Qze9PMfhN2lh5mttXM1prZGjOrDjtPDzMbbWaPm9m7ZrbBzD4RdiYAMzs5+LvqeRw0s6/HQa7/Efw//46ZPWJmWWFnAjCzrwWZ1oX992RmPzKzvWb2Tp95RWb2nJltCv4sjPXnJkTpA53AN9x9BnAOcLOZzQg5E0AbcIm7nw6cASwws3NCztTX14ANYYcYwMXufkacnVL378DT7n4KcDpx8vfm7huDv6szgDlAC/BUmJnMbCLwVaDK3WcCqcCiMDMBmNlM4AZgLpH/hp8xs2khRvoJsKDfvNuB/3b36cB/B9MxlRCl7+573P2N4PkhIv8gJ4abCjyiKZhMDx5xcRDFzMqAK4EHws4S78ysALgQ+CGAu7e7+/5wUw3ok8AWdz+eLzfGShqQbWZpQA6wO+Q8AKcCr7l7i7t3Ai8Anw8rjLu/CDT2m30V8GDw/EHgs7H+3IQo/b7MrAI4E3gt3CQRwS6UNcBe4Dl3j4tcwA+A/wV0hx2kHweeNbPVZrYk7DCBSqAO+HGwO+wBM8sNO9QAFgGPhB3C3XcB/w/YDuwBDrj7s+GmAuAd4AIzKzazHOAKoDzkTP2Ndfc9wfP3gbGx/oCEKn0zywOeAL7u7gfDzgPg7l3BpncZMDfYxAyVmX0G2Ovuq8POMoDz3f0s4HIiu+kuDDsQkbXWs4B73f1MoJlh2Ow+HmaWASwE/isOshQSWWOtBCYAuWb2xXBTgbtvAL4HPAs8DawBukINdRQeObUy5nsGEqb0zSydSOH/3N2fDDtPf8HugBV8fB9eGM4DFprZVuBR4BIz+1m4kSKCtUTcfS+RfdNzw00EwE5gZ5+ttMeJDALx5HLgDXf/IOwgwKeAWnevc/cO4Eng3JAzAeDuP3T3Oe5+IbAPeC/sTP18YGbjAYI/98b6AxKi9M3MiOxv3eDu3w87Tw8zKzWz0cHzbOBS4N1wU4G73+HuZe5eQWSXwB/dPfQ1MTPLNbP8nufAZUQ2yUMwrFoNAAAA+UlEQVTl7u8DO8zs5GDWJ4H1IUYayGLiYNdOYDtwjpnlBP82P0mcHPg2szHBn5OI7M9/ONxEH7McuCZ4fg3wq1h/QFqsf2BIzgP+Glgb7D8H+Ka7/y7ETADjgQfNLJXIAPuYu8fN6ZFxaCzwVKQnSAMedvenw43U6++Anwe7UWqAvw05T69ggLwU+HLYWQDc/TUzexx4g8iZdW8SP9+AfcLMioEO4OYwD8ib2SPAfKDEzHYCdwJ3A4+Z2XVErjb8hZh/rr6RKyKSPBJi946IiERHpS8ikkRU+iIiSUSlLyKSRFT6IiJJ
RKUvIpJEVPoiIklEpS8ikkT+P5PKA5/jzzYaAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.plot(range(2,11),silhouette_score_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Do you get a similar answer as with the elbow method?\n", "\n", "Now try using the silhouette score to find the best value of k for the labor data." ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAD8CAYAAACb4nSYAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3XmYVPWd7/H3t/eVpjfWbuhmcUHAhRaMK5poUDOYxJk8MJMZnajEqJPkmjtXTTLXGTNzY+beyZN5nkEdNIsxUeO4JGRzyQSXqCiNogiIQDc7Si9svdDr9/5Rp9u2behqqOZUV31ez1MPdU6d6vo0yud3tjrH3B0REUkOKWEHEBGRE0elLyKSRFT6IiJJRKUvIpJEVPoiIklEpS8ikkRU+iIiSUSlLyKSRFT6IiJJJC3sAP2VlJR4RUVF2DFEREaU1atX17t76WDLxV3pV1RUUF1dHXYMEZERxcy2RbOcdu+IiCQRlb6ISBJR6YuIJBGVvohIElHpi4gkEZW+iEgSUemLiCSRhCn9A60d/NuzG9lS1xR2FBGRuJUwpd/R1c39L9Vw3/Nbwo4iIhK3Eqb0S/IyWXT2JJ56cxe79reGHUdEJC4lTOkDLLlwCgD3v1gTchIRkfiUUKU/YXQ2nztzIo+8vp36praw44iIxJ2EKn2AG+dPpb2rmx+/XBt2FBGRuJNwpT+1NI8rZo7np69s4+DhjrDjiIjElYQrfYCvzJ/KobZOHno1qiuNiogkjYQs/ZkTC7jopFJ+9KdaWtu7wo4jIhI3ErL0AW6+eBoNze08Vr0j7CgiInEjYUt/bmURZ1cU8p8vbKG9szvsOCIicSFhSx/gpounsfvAYX61ZlfYUURE4kJCl/78k0qZMX4U976wha5uDzuOiEjoErr0zYybLp5KTV0zz657P+w4IiKhS+jSB7h85ngqS3JZ+vxm3LW2LyLJLeFLPzXFuPGiKbyz6yAvbqoPO46ISKgSvvQBPndmGeMLsli6YnPYUUREQhVV6ZvZAjPbaGabzez2IyzzBTNbb2brzOzhPvOvMbNNweOaWAUfioy0FG64YAqv1zZSvbUxjAgiInFh0NI3s1RgKXA5MANYbGYz+i0zHbgDOM/dTwO+HswvAu4E5gFzgTvNrDCmv0GUFs0tpyg3g3t0kxURSWLRrOnPBTa7e427twOPAlf1W+YGYKm77wNw973B/E8Dz7l7Y/Dac8CC2EQfmpyMNP723Ar++O5e1u0+EEYEEZHQRVP6E4G+1zLYGczr6yTgJDN72cxWmtmCIbwXM1tiZtVmVl1XVxd9+iH6m09UkJeZxr1a2xeRJBWrA7lpwHRgPrAYuN/MRkf7Zndf5u5V7l5VWloao0gfV5CTzhfPmczv1u6htr552D5HRCReRVP6u4DyPtNlwby+dgLL3b3D3WuB94gMAtG8
94S67vxK0lNT+M8XtLYvIsknmtJfBUw3s0ozywAWAcv7LfNLImv5mFkJkd09NcAzwGVmVhgcwL0smBea0vxMvlBVzhNv7GTPAd1AXUSSy6Cl7+6dwC1EynoD8Ji7rzOzu8xsYbDYM0CDma0HVgB/7+4N7t4IfIfIwLEKuCuYF6olF06h2+H+F3VLRRFJLhZvlyaoqqry6urqYf+cWx9bw+/Xvs/Lt19CUW7GsH+eiMhwMrPV7l412HJJ8Y3cgdw0fyqHO7v4iW6gLiJJJGlLf9qYfC6bMZafvLKVQ7qBuogkiaQtfYCb5k/j4OFOfv7a9rCjiIicEEld+qeXj+aC6SU88FIthzt0A3URSXxJXfoQWduvb2rjv1bvDDuKiMiwS/rSP2dKEWdOGs1/vrCFji7dQF1EElvSl76ZcfP8aezc18qv39oddhwRkWGV9KUPcMkpYzhlXD73Pr+Fbt1AXUQSmEofSEkxvjJ/Kpv2NvHchg/CjiMiMmxU+oErZ41nUlEO96zQDdRFJHGp9ANpqSnceNFU3tp5gJc3N4QdR0RkWKj0+7h6zkTG5GfqBuoikrBU+n1kpqWy5MIpvFrTwBvb94UdR0Qk5lT6/SyeO4nROencs0I3WRGRxKPS7yc3M41rz63gDxs+4N33D4YdR0QkplT6A7j23ApyMlJ1A3URSTgq/QGMzsngi+dM5tdv7WZ7Q0vYcUREYkalfwTXnV9JWkoK972otX0RSRwq/SMYOyqLP68q4/HqnXxw8HDYcUREYkKlfxQ3XjiVzu5uHnipJuwoIiIxodI/iknFOSw8fQI/f207+1vaw44jInLcVPqD+Mr8abS0d/GTV7aGHUVE5Lip9Adx8rh8PnXqWH788laa2jrDjiMiclyiKn0zW2BmG81ss5ndPsDr15pZnZmtCR7X93mtq8/85bEMf6LcdPFUDrR28IhuoC4iI1zaYAuYWSqwFLgU2AmsMrPl7r6+36K/cPdbBvgRre5+xvFHDc9Zkwo5d2ox979Uw9+cO5nMtNSwI4mIHJNo1vTnApvdvcbd24FHgauGN1b8uWn+NPYeauOJ1bvCjiIicsyiKf2JwI4+0zuDef1dbWZvm9njZlbeZ36WmVWb2Uoz++xAH2BmS4Jlquvq6qJPfwKdN62Y08sKuO+FLXTqBuoiMkLF6kDur4EKd58NPAc82Oe1ye5eBfwl8AMzm9r/ze6+zN2r3L2qtLQ0RpFiy8y46eJpbG9s4bdr94QdR0TkmERT+ruAvmvuZcG8Xu7e4O5tweQDwJw+r+0K/qwBngfOPI68obr01LFMH5PHPSt0A3URGZmiKf1VwHQzqzSzDGAR8JGzcMxsfJ/JhcCGYH6hmWUGz0uA84D+B4BHjJ4bqG/84BB/fHdv2HFERIZs0NJ3907gFuAZImX+mLuvM7O7zGxhsNhXzWydmb0FfBW4Nph/KlAdzF8B3D3AWT8jyp+dPoGywmz+QzdQF5ERyOKtuKqqqry6ujrsGEf10Mpt/MMv3+HhG+Zx7tSSsOOIiGBmq4Pjp0elb+Qeg7+YU0ZJXqZusiIiI45K/xhkpady/QWVvLSpnrd27A87johI1FT6x+iv5k1iVFYa9zy/OewoIiJRU+kfo/ysdK49t4Jn1n3Apg8OhR1HRCQqKv3jcO15lWSnp3LvC9q3LyIjg0r/OBTlZrB47iR+tWY3Oxp1A3URiX8q/eN0w4WVpBgse1G3VBSR+KfSP07jC7K5+qwyflG9g72HdAN1EYlvKv0Y+PJFU+ns6uZHf9oadhQRkaNS6cdAZUkuV8waz89WbuNAS0fYcUREjkilHyM3zZ9GU1snP311a9hRRESOSKUfIzMmjOKSU8bwo5draWnXDdRFJD6p9GPo5ounsq+lg0df3zH4wiIiIVDpx9CcyUXMrSxi2Ys1tHfqlooiEn9U+jF288XTeP/gYZ56c2fYUUREPkalH2MXTi9h5sRR3PdCDV26paKIxBmVfoyZGTfP
n0ZtfTO/f0c3UBeR+KLSHwafPm0cU0pzWbpii26pKCJxRaU/DFJSjK9cNJUNew7y/Ma6sOOIiPRS6Q+Tz545kYmjs1m6QjdZEZH4odIfJumpKSy5cArV2/bxem1j2HFERACV/rD6QlU5xbkZWtsXkbih0h9G2RmpfOn8Sl54r043UBeRuBBV6ZvZAjPbaGabzez2AV6/1szqzGxN8Li+z2vXmNmm4HFNLMOPBH/9icmU5GXw5YdWU1vfHHYcEUlyg5a+maUCS4HLgRnAYjObMcCiv3D3M4LHA8F7i4A7gXnAXOBOMyuMWfoRYFRWOj+7fh7tXd0sXraSrSp+EQlRNGv6c4HN7l7j7u3Ao8BVUf78TwPPuXuju+8DngMWHFvUkeuUcaN4+IZI8S9S8YtIiKIp/YlA38tG7gzm9Xe1mb1tZo+bWflQ3mtmS8ys2syq6+oS87z2U8aN4ufXq/hFJFyxOpD7a6DC3WcTWZt/cChvdvdl7l7l7lWlpaUxihR/Th0fKf62zi4W36/iF5ETL5rS3wWU95kuC+b1cvcGd28LJh8A5kT73mRz6vhRPHzDORzuiBT/tgYVv4icONGU/ipguplVmlkGsAhY3ncBMxvfZ3IhsCF4/gxwmZkVBgdwLwvmJbW+xb9omYpfRE6cQUvf3TuBW4iU9QbgMXdfZ2Z3mdnCYLGvmtk6M3sL+CpwbfDeRuA7RAaOVcBdwbykF9nVo+IXkRPL4u0qkFVVVV5dXR12jBNm/e6D/NUDK8lOT+WRJecwuTg37EgiMgKZ2Wp3rxpsOX0jN2QzJkTW+Fs6uli8bCXbG1rCjiQiCUylHwdmTBjFw0HxL1r2qopfRIaNSj9ORNb456n4RWRYqfTjyGkTCnqLf/H92tUjIrGn0o8zp00o4GfXzaO5vZPF969kR6OKX0RiR6Ufh2ZOjBR/U1sni5ap+EUkdlT6cWrmxMiuHhW/iMSSSj+OqfhFJNZU+nFOxS8isaTSHwFU/CISKyr9EaKn+A8d7tBZPSJyzFT6I0ik+M/hYGuk+HfuU/GLyNCo9EeYWWUfFv+iZSp+ERkalf4INKusgJ9dP0/FLyJDptIfoWaXje4tfu3qEZFoqfRHsJ7i398SKf5d+1vDjiQicU6lP8LNLhvNz4PiX7TsVRW/iByVSj8BzC4bzc+uU/GLyOBU+gni9PIPi3/xMu3qEZGBqfQTyOnlo3nounnsa2ln8bKV7Fbxi0g/Kv0Ec0ZP8Te3s0jFLyL9qPQT0Bnlo3noehW/iHycSj9B9S3+xfer+EUkIqrSN7MFZrbRzDab2e1HWe5qM3MzqwqmK8ys1czWBI/7YhVcBndG+Wh+et1cGpsixb/ngIpfJNkNWvpmlgosBS4HZgCLzWzGAMvlA18DXuv30hZ3PyN43BiDzDIEZ04q7C3+RctU/CLJLpo1/bnAZnevcfd24FHgqgGW+w7wPeBwDPNJDJw5qZAHVfwiQnSlPxHY0Wd6ZzCvl5mdBZS7+28HeH+lmb1pZi+Y2QUDfYCZLTGzajOrrquriza7DMFZQfE3NEVO53z/gMZmkWR03AdyzSwF+D7wjQFe3gNMcvczgVuBh81sVP+F3H2Zu1e5e1VpaenxRpIjOCvY1VPf1M6iZa+q+EWSUDSlvwso7zNdFszrkQ/MBJ43s63AOcByM6ty9zZ3bwBw99XAFuCkWASXY3PWpEIe/JKKXyRZRVP6q4DpZlZpZhnAImB5z4vufsDdS9y9wt0rgJXAQnevNrPS4EAwZjYFmA7UxPy3kCGZMzlS/HWH2rj1sTW4e9iRROQEGbT03b0TuAV4BtgAPObu68zsLjNbOMjbLwTeNrM1wOPAje7eeLyh5fjNmVzIN688lVe2NPBf1TvDjiMiJ4jF21peVVWVV1dXhx0jKXR3O4vuX8m7ew7yh1svYsyorLAjicgxMrPV7l412HL6Rm4SS0kx7v78LA53dnPn8nVhxxGRE0Cln+SmlObxtU9O5/fv
vM/T77wfdhwRGWYqfWHJhVOYMX4U//tX73CgtSPsOCIyjFT6QnpqCt+7ejb1TW1893cbwo4jIsNIpS8AzCor4IYLpvDoqh28sqU+7DgiMkxU+tLr6586icnFOdzx5Fpa27vCjiMiw0ClL72yM1L57udnsa2hhR/84b2w44jIMFDpy0ecO7WERWeXc/9LNazdeSDsOCISYyp9+Zg7rjiVkrxMbnvibTq6usOOIyIxpNKXjynITueuq2ayfs9B7n9Jl0oSSSQqfRnQgpnjWHDaOH7wh03U1DWFHUdEYkSlL0d011WnkZWWwu1PrqW7O76u0SQix0alL0c0ZlQW37ryVF6vbeSRVdvDjiMiMaDSl6P6QlU5504t5u7fvasbrogkAJW+HJWZ8d3Pz6Kju5tv//Id3XBFZIRT6cugJhfncuulJ/GHDR/w27V7wo4jIsdBpS9R+dJ5lcyaWMA/Ll/Hvub2sOOIyDFS6UtU0oIrce5v6eCff6srcYqMVCp9idqMCaP48kVTeOKNnbz4Xl3YcUTkGKj0ZUj+7pLpTCnN5ZtPraWlvTPsOCIyRCp9GZKs9FTu/vxsdu5r5d+e1ZU4RUYalb4M2dzKIv5q3iR+/HIta3bsDzuOiAxBVKVvZgvMbKOZbTaz24+y3NVm5mZW1WfeHcH7NprZp2MRWsJ3++WnMCY/i9sef5v2Tl2JU2SkGLT0zSwVWApcDswAFpvZjAGWywe+BrzWZ94MYBFwGrAAuCf4eTLC5Wel88+fncnGDw5x3wtbwo4jIlGKZk1/LrDZ3WvcvR14FLhqgOW+A3wP6Ptd/auAR929zd1rgc3Bz5ME8KkZY/nM7PH8xx83s3nvobDjiEgUoin9icCOPtM7g3m9zOwsoNzdfzvU98rI9o8LTyMnM5XbntCVOEVGguM+kGtmKcD3gW8cx89YYmbVZlZdV6fzv0eSkrxM/uHKGazeto+HVm4LO46IDCKa0t8FlPeZLgvm9cgHZgLPm9lW4BxgeXAwd7D3AuDuy9y9yt2rSktLh/YbSOg+f9ZELphewr8+/S679reGHUdEjiKa0l8FTDezSjPLIHJgdnnPi+5+wN1L3L3C3SuAlcBCd68OlltkZplmVglMB16P+W8hoTIz/s/nZuHAt55aqytxisSxQUvf3TuBW4BngA3AY+6+zszuMrOFg7x3HfAYsB54GrjZ3buOP7bEm/KiHP7nZSfz/MY6frVmd9hxROQILN7Wyqqqqry6ujrsGHIMurqdq+99hW0Nzfzh1osozssMO5JI0jCz1e5eNdhy+kauxExqivGvfz6bprZOvvOb9WHHEZEBqPQlpk4am89X5k/jl2t2s2Lj3rDjiEg/Kn2JuZsvnsq0MXl868m1NLXpSpwi8USlLzGXmZbK966ezZ6Dh/m/T78bdhwR6UOlL8NizuRCrvlEBT9duY3V2xrDjiMiAZW+DJu///TJTCjI5rYn1tLWqTN1ReKBSl+GTW5mGv/yuZls3tvE0j9uDjuOiKDSl2E2/+QxfO7Midzz/Bbeff9g2HFEkp5KX4bdP3xmBqOy07ntibV06UqcIqFS6cuwK8rN4M4/m8FbO/bz45drw44jktRU+nJCLDx9ApecMoZ/e/Y9djS2hB1HJGmp9OWEMDP++bMzSTH4pq7EKRIalb6cMBNGZ3Pb5afw0qZ6nnjjY7dVEJETQKUvJ9QX502manIh3/nNeuoOtYUdRyTpqPTlhEpJMe6+ejat7V3846/XhR1HJOmo9OWEmzYmj7+7ZBq/fXsPz63/IOw4IklFpS+h+PJFUzllXD7f/uVaDh7uCDuOSNJQ6UsoMtJS+N7Vs6k71Mbdv9eVOEVOFJW+hOb08tF86bxKHn5tOytrGsKOI5IUVPoSqlsvO4nyomzueHIthzt0JU6R4abSl1DlZKTx3c/Npra+mX//701hxxFJeCp9Cd3500v4izllLHuxhnW7D4QdRyShqfQlLnzrylMpzMngtifeprOr
O+w4IgkrqtI3swVmttHMNpvZ7QO8fqOZrTWzNWb2JzObEcyvMLPWYP4aM7sv1r+AJIbRORn808LTeGfXQX74J12JU2S4pA22gJmlAkuBS4GdwCozW+7u6/ss9rC73xcsvxD4PrAgeG2Lu58R29iSiK6YNY5LZ4zl+8+9R2VJLuVFOZTkZVKUm0FqioUdTyQhDFr6wFxgs7vXAJjZo8BVQG/pu3vfWyLlArqEogxZz5U4P/2DF1ny0Ore+SkWuSZ/SV4mJXmZFOd9+LwkL4OS/ExK+7yWnqq9liJHEk3pTwR29JneCczrv5CZ3QzcCmQAl/R5qdLM3gQOAt9295eOPa4kurGjsljxjfls2ttEfVNb5HGojbqm9t7pbdubqT/UTusRTvEcnZPeOyAU5/UMCH0GivwPp7PSU0/wbygSrmhKPyruvhRYamZ/CXwbuAbYA0xy9wYzmwP80sxO67dlgJktAZYATJo0KVaRZIQqzM1gbmXRoMs1t3V+ODD0DAqH2vvMa2P97oPUH2rjUFvngD8jPzPtI4PAR7ciMinN/3A6NzNm/1xEQhPN/8W7gPI+02XBvCN5FLgXwN3bgLbg+Woz2wKcBFT3fYO7LwOWAVRVVWnXkEQlNzON3Mw0JhfnDrrs4Y6u3sGhoc9AUXfowwFi094mXq1pY3/LwNcCyk5PpTAnnVHZwSMrnVHZaRT0Pk8Pnqd9+DyYzstMw0zHJSR80ZT+KmC6mVUSKftFwF/2XcDMprt7zzdrrgQ2BfNLgUZ37zKzKcB0oCZW4UWilZWeSllhDmWFOYMu297ZTWNzZIuhrqmNht6tiDb2t3ZwoLWDg60d7NrfyoY9kedH2pLokWL0DhSRwSCtz/PIwFDwkcEknYJgmVHZ6doNJTEzaOm7e6eZ3QI8A6QCP3L3dWZ2F1Dt7suBW8zsU0AHsI/Irh2AC4G7zKwD6AZudPfG4fhFRGIlIy2FcQVZjCvIivo9Xd1O0+HOyIBwODIQfPh8oPmd7D3Y1Dv/cMfRv5uQkZYSDBJpAw4MU0rzmFdZRFlhtrYo5Kgs3u5VWlVV5dXV1YMvKJJA2jq7ONja+bGBof/g0fP6weD1nq2Ozu7Iv+PxBVnMrSxibmUR8yqLmFqap0EgSZjZanevGmw5HZkSiQOZaamU5qdSmp855Pd2dzub9jbxem0Dr9U28sqWBn61ZjcAxbkZnF1R1DsQnDp+lL7zkORU+iIjXEqKcfK4fE4el89ff6ICd2dbQwuv1zbyWm0jr29t4Ol17wORs5WqKgqZW1nM3MoiZk0sICNN32tIJip9kQRjZlSU5FJRkssXzo6ceLd7fyurtgaDQG0jKzZGblyTlZ7CWZMKe7cEziwvJDtDB40TmfbpiySh+qY2qvsMAuv3HMQd0lON2WWjeweBOZMLGZWVHnZciUK0+/RV+iLCgdYO3ti2LxgEGnh75wE6u50UgxkTRjG3IrI76OyKQorzhn7cQYafSl9EjllLeydrtu/v3RJ4Y/s+2jojp5VOH5PXuyUwt7KI8QXZIacVUOmLSAy1dXbxzq4DvYNA9dZ9NAVfSJtUlPOR00QnFeXoNNEQqPRFZNh0dnXz7vuHencHvV7byL7g8hVjR2VydkVkAJg5sYDKklxG52SEnDj+dHR1s72xhS17m6ipb6amronC3AzuuPzUY/p5Ok9fRIZNWmoKMycWMHNiAdedX0l3t7Olrql3S+C12gZ+8/ae3uVH56QzuTiXyuKcyJlFxZGziyqLcynISewDxY3N7dTUNVFT18yWuia21EUKfntjS++X6gBK8zM5f1rJsOfRmr6IxJy7s6OxlY0fHGJrfTO1Dc1sa2hma30Luw+00rd2RuekU1GcS2VJLpOLc6jsGRRG0IDQ0dXNtoaWSLnXN39k7X1fnwv4ZaSmUFGSw9TSPKaU5gZ/Rp4f71lSWtMXkdCYGZOKc5hU/PEL3B3u6GJHYwu19c1sbWhma0MLW+ubea2mgafe/OgF
fAtz0j/cMijOpaIkp3croSD7xA8Ijc3tbKlr+siae01d84Br7VNKcrl81nimlETKfWppHhMLs0P/RrRKX0ROqKz0VKaPzWf62PyPvXa4o4vtwYCwraGZ2vojDwhFuRmRLYPiXCYHA0Jka+H4BoS+a+09u2Jq6iMFv7/fWntlSS4nj8vn8lnjYrrWPpxU+iISN7LSUzlpbD4nHWFA2NbQEtk66NlKqG/h1ZoGnhxgQKgo/nCrILK1EDme0FPIfdfae8u9rpltjS10DbDWfkXPWvuYPKaWxMda+7FQ6YvIiJCVntp7jaH+Wts/3ELY2tCzldDMK1s+PiAU52bQ5X7EtfYrZo1nSmnuiFhrPxYqfREZ8bIzjj4gbGvs2TqI7C5KSbGEWGs/Fip9EUlo2RmpnDJuFKeMGxV2lLiga6qKiCQRlb6ISBJR6YuIJBGVvohIElHpi4gkEZW+iEgSUemLiCQRlb6ISBKJu0srm1kdsO04fkQJUB+jOLGkXEOjXEOjXEOTiLkmu3vpYAvFXekfLzOrjuaa0ieacg2Ncg2Ncg1NMufS7h0RkSSi0hcRSSKJWPrLwg5wBMo1NMo1NMo1NEmbK+H26YuIyJEl4pq+iIgcQUKUvpmVm9kKM1tvZuvM7GthZwIwsywze93M3gpy/VPYmfoys1Qze9PMfhN2lh5mttXM1prZGjOrDjtPDzMbbWaPm9m7ZrbBzD4RdiYAMzs5+LvqeRw0s6/HQa7/Efw//46ZPWJmWWFnAjCzrwWZ1oX992RmPzKzvWb2Tp95RWb2nJltCv4sjPXnJkTpA53AN9x9BnAOcLOZzQg5E0AbcIm7nw6cASwws3NCztTX14ANYYcYwMXufkacnVL378DT7n4KcDpx8vfm7huDv6szgDlAC/BUmJnMbCLwVaDK3WcCqcCiMDMBmNlM4AZgLpH/hp8xs2khRvoJsKDfvNuB/3b36cB/B9MxlRCl7+573P2N4PkhIv8gJ4abCjyiKZhMDx5xcRDFzMqAK4EHws4S78ysALgQ+CGAu7e7+/5wUw3ok8AWdz+eLzfGShqQbWZpQA6wO+Q8AKcCr7l7i7t3Ai8Anw8rjLu/CDT2m30V8GDw/EHgs7H+3IQo/b7MrAI4E3gt3CQRwS6UNcBe4Dl3j4tcwA+A/wV0hx2kHweeNbPVZrYk7DCBSqAO+HGwO+wBM8sNO9QAFgGPhB3C3XcB/w/YDuwBDrj7s+GmAuAd4AIzKzazHOAKoDzkTP2Ndfc9wfP3gbGx/oCEKn0zywOeAL7u7gfDzgPg7l3BpncZMDfYxAyVmX0G2Ovuq8POMoDz3f0s4HIiu+kuDDsQkbXWs4B73f1MoJlh2Ow+HmaWASwE/isOshQSWWOtBCYAuWb2xXBTgbtvAL4HPAs8DawBukINdRQeObUy5nsGEqb0zSydSOH/3N2fDDtPf8HugBV8fB9eGM4DFprZVuBR4BIz+1m4kSKCtUTcfS+RfdNzw00EwE5gZ5+ttMeJDALx5HLgDXf/IOwgwKeAWnevc/cO4Eng3JAzAeDuP3T3Oe5+IbAPeC/sTP18YGbjAYI/98b6AxKi9M3MiOxv3eDu3w87Tw8zKzWz0cHzbOBS4N1wU4G73+HuZe5eQWSXwB/dPfQ1MTPLNbP8nufAZUQ2yUMwrFoNAAAA+UlEQVTl7u8DO8zs5GDWJ4H1IUYayGLiYNdOYDtwjpnlBP82P0mcHPg2szHBn5OI7M9/ONxEH7McuCZ4fg3wq1h/QFqsf2BIzgP+Glgb7D8H+Ka7/y7ETADjgQfNLJXIAPuYu8fN6ZFxaCzwVKQnSAMedvenw43U6++Anwe7UWqAvw05T69ggLwU+HLYWQDc/TUzexx4g8iZdW8SP9+AfcLMioEO4OYwD8ib2SPAfKDEzHYCdwJ3A4+Z2XVErjb8hZh/rr6RKyKSPBJi946IiERHpS8ikkRU+iIiSUSlLyKSRFT6IiJJRKUvIpJEVPoiIklEpS8i
kkT+P5PKA5/jzzYaAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.plot(range(2,11),silhouette_score_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How does the k with the best silhouette score compare to the k suggested by the elbow method?\n", "\n", "## Starbucks drinks dataset\n", "\n", "Try both the elbow method and the silhouette score to estimate the best number of clusters for Starbucks drinks, based on their nutritional information. The original dataset is from Kaggle [here]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.4.8" } }, "nbformat": 4, "nbformat_minor": 2 }